The Enigma of Similarity

Similarity judgments lie at the very heart of good decision-support capabilities

"Similar" is a foundational notion. The correct construal of the similarity between people, objects, events, products, legal issues, options, and strategies is fundamental to the success and survival of businesses and people alike. Yet, the notion of similarity is an enigma. The human judgment of similarity is a highly complex process, generally not fully accessible to conscious observation and highly dependent on application, need, and context. By looking at four computational principles observable in the human cortex and how they appear in five differing computer-based decision and recognition algorithms, I'll illustrate some of the rudimentary issues involved in similarity judgments.

Computational Principles

The four computational principles follow:
Examples of head algorithms include discriminant analysis, decision trees, and neural nets (in order of increasing flexibility). I call them head algorithms because they focus exclusively on partitioning a space defined by the original raw, untransformed data into regions that seem to predict a desired difference. A strong example is ParallAX.

The Principles in Action

MDG Ltd.'s ParallAX software combines visualization and data analysis capabilities with a powerful head algorithm. Published tests indicate this algorithm is faster and more accurate than many competing decision algorithms. ParallAX uses parallel coordinates to visualize many dimensions without loss. Each vertical (red) line represents a dimension (see Figure 1), and each piecewise linear (black or blue) line represents a record, with the record's value on each dimension given by the point where its line crosses that dimension's axis.

In the data exploration phase, you can define subsets of the data and see how they affect point segmentation in a scatter plot of labeled data (black vs. blue). (See the scatter plot window in Figure 1.) You can define these subsets by marking ranges on individual dimensions, finding local correlations between variables on the parallel coordinates, marking or wrapping regions in two-dimensional scatter plots, and combining all of these with set operations. The software's decision algorithms, however, combine all of these automatically to find the optimal rules for grouping similar with similar, given your basic input data.
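To make the parallel-coordinates idea concrete, here is a minimal Python sketch of how a record might be turned into a polyline and how range-marked subsets can be combined with set operations. It is an illustration only, not ParallAX's actual implementation or API: the function names (axis_ranges, to_polyline, brush) and the sample records are hypothetical, and scaling each axis to the interval from 0 to 1 is an assumption made for simplicity.

```python
from typing import List, Sequence, Set, Tuple

Record = Sequence[float]


def axis_ranges(records: Sequence[Record]) -> List[Tuple[float, float]]:
    """Return the (min, max) observed on each dimension."""
    columns = list(zip(*records))
    return [(min(col), max(col)) for col in columns]


def to_polyline(record: Record,
                ranges: Sequence[Tuple[float, float]]) -> List[Tuple[int, float]]:
    """Map one record to polyline vertices.

    Each vertex is (axis_index, height), where height is the record's value
    on that axis rescaled to [0, 1] (an assumed convention for this sketch).
    """
    points = []
    for i, (value, (lo, hi)) in enumerate(zip(record, ranges)):
        height = 0.5 if hi == lo else (value - lo) / (hi - lo)
        points.append((i, height))
    return points


def brush(records: Sequence[Record], dim: int, lo: float, hi: float) -> Set[int]:
    """Indices of records whose value on dimension `dim` lies in [lo, hi]."""
    return {i for i, rec in enumerate(records) if lo <= rec[dim] <= hi}


# Hypothetical sample data: four records, three dimensions.
records = [
    (1.0, 10.0, 0.2),
    (2.0, 30.0, 0.8),
    (3.0, 20.0, 0.5),
    (4.0, 40.0, 0.1),
]

ranges = axis_ranges(records)
polylines = [to_polyline(rec, ranges) for rec in records]

# Mark a range on two different axes, then combine the subsets.
low_first = brush(records, 0, 1.0, 3.0)
high_second = brush(records, 1, 20.0, 40.0)
both = low_first & high_second    # intersection of the two marked ranges
either = low_first | high_second  # union of the two marked ranges
```

Because every dimension gets its own axis, no dimension has to be dropped or projected away to draw a record, which is the sense in which the display is "without loss." The marked subsets are ordinary sets of record indices, so intersections, unions, and differences compose directly.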