Self-organizing maps (SOMs, also referred to as Kohonen maps) are used to create an ordered representation of multi-dimensional data that simplifies complexity and reveals meaningful relationships. SOMs are a particularly robust form of unsupervised neural network that, since their introduction by Prof. Teuvo Kohonen in the early 1980s, have been the technological basis of countless applications as well as the subject of many thousands of publications.
The SOM method can be viewed as a non-parametric regression technique that converts multi-dimensional data spaces into lower-dimensional abstractions. Much as a regression plane is an abstraction of the original data, a SOM generates a representation of the data distribution, with the crucial difference that this representation is non-linear.
For data mining purposes, it has become standard to approximate the SOM by a two-dimensional hexagonal grid. The “nodes” on the grid are associated with so-called “reference vectors”, which point to distinct regions in the original data space. Starting with sets of numerical, multivariate data, these reference vectors gradually adapt to the intrinsic shape of the data distribution, whereby the reference vectors of neighboring nodes point to adjacent regions in the data space. Thus the order on the grid reflects the neighborhood within the data, such that features of the data distribution can be read directly from the emerging landscape on the grid.
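The adaptation of reference vectors described above can be illustrated with a minimal sketch of the basic Kohonen update rule. This is not Viscovery's implementation: for brevity it assumes a rectangular rather than hexagonal grid, a Gaussian neighborhood, and linearly decaying learning rate and radius. The names `train_som` and `best_matching_unit`, all parameter defaults, and the synthetic data are hypothetical.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Return the grid position whose reference vector is closest to x."""
    return min(nodes, key=lambda pos: sum((w - v) ** 2 for w, v in zip(nodes[pos], x)))


def train_som(data, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Train a small SOM; returns {(row, col): reference vector}."""
    rng = random.Random(seed)
    # Assumption: initialise every reference vector from a randomly chosen record.
    nodes = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # visit records in random order
            progress = step / total_steps
            lr = lr0 * (1.0 - progress)                    # learning rate decays to 0
            radius = max(radius0 * (1.0 - progress), 0.5)  # neighbourhood shrinks
            bmu = best_matching_unit(nodes, x)
            for pos, w in nodes.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # Gaussian neighbourhood weight
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])  # pull the node towards the record
            step += 1
    return nodes


# Synthetic two-dimensional data: two groups of records (illustrative only).
rng = random.Random(1)
data = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(50)]
data += [[rng.gauss(10, 0.5), rng.gauss(10, 0.5)] for _ in range(50)]

som = train_som(data, rows=4, cols=4)
```

Because each update moves a node part of the way towards a data record, and nodes near the best-matching node on the grid move with it, neighboring nodes tend to end up pointing to adjacent regions of the data space.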

This powerful method of data representation is provided by many leading data mining suites on the market. For the Viscovery system, the SOM method is the basis on which a multitude of analytical and statistical techniques are applied. Viscovery systematically combines SOMs with classical statistical methods in an intuitive visual environment that allows anyone to understand the resulting analytical model, regardless of their statistics background: there is no need for familiarity with the basic Kohonen algorithm. In Viscovery, the details of the SOM creation process are shielded from the user, who is guided through the application in an environment of well-balanced settings and defaults.
In Viscovery, the data representation contained in the trained SOM is systematically converted for use across a broad spectrum of visualization techniques. When Viscovery is used to evaluate dependences, to investigate properties of the data distribution, to search for clusters, or to monitor new data – just to mention a few options – an intuitive and inspiring interactive process emerges.
In addition to the capabilities for data exploration, Viscovery employs a multitude of statistical techniques for the creation and application of classification and prediction models, all embedded in a workflow-guided project environment. The Viscovery data mining products offer comprehensive technical features for the generation of predictive models, such as scoring models or segmentations, as well as their application and real-time integration into an operational environment.
Much of the theoretical background of SOMs, as well as many innovative algorithms in the field, is owed to Prof. Kohonen, who, as Head of the Laboratory of Computer and Information Science at the Helsinki University of Technology, contributed prominently to the creation, evolution, and spread of SOM technology. The originator of several new concepts, Prof. Kohonen is the author of hundreds of scientific papers and several textbooks, among them the standard text “Self-Organizing Maps”. His many contributions to scientific progress have earned numerous awards and honors.
A SOM may be the most compact way to represent a data distribution. Because SOMs represent complex data in an intuitive two-dimensional perceptual space, data dependences can be understood easily once one is familiar with the map visualization. The following example provides an intuitive explanation of the basics of Viscovery visualization.

Imagine 1000 people on a football field. We define a number of attributes (e.g. gender, age, family status, income) and ask the people on the field to move closer to other people who are most similar to them according to all these attributes. After a while, everyone on the field is surrounded by those people that share similar attribute values. This configuration is an example of a two-dimensional representation of multi-dimensional data points.

Now imagine that, looking over the crowd, you ask everyone to raise a colored flag according to their age (blue for under 20, green for 20 to 29, yellow for 30 to 39, orange for 40 to 49, and red for 50 and over). The pattern of color that you see corresponds to the distribution of the attribute “Age” on the football field. Next you ask the crowd to remain in place and raise a colored flag according to their income, and so on for the other attributes. For each attribute, you take a photo of the color distribution on the field. These color patterns correspond to the color-coded maps visualized within Viscovery software.

Finally, you can put all the photos side by side and inspect the dependences. For example, you might see clusters of younger people (blue/green) as well as clusters of older people (orange/red). Further, you could detect some correlation between age clusters and income clusters: e.g., higher incomes occur in older groups. Continuing in this manner, you will discover further relationships among the defined attributes.
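In code, each "photo" in this analogy amounts to reading one attribute out of every reference vector while keeping the grid layout. The following is a small sketch under stated assumptions: the 2x3 grid and its values are invented for illustration, `component_plane` and `age_flag` are hypothetical names, and only the age bands are taken from the example above.

```python
# Upper bounds (exclusive) of the age bands from the flag example; 50+ is red.
AGE_FLAGS = [(20, "blue"), (30, "green"), (40, "yellow"), (50, "orange")]


def age_flag(age):
    """Map an age to the flag color used in the football-field example."""
    for upper, color in AGE_FLAGS:
        if age < upper:
            return color
    return "red"


def component_plane(grid, attribute_index):
    """Extract one attribute from every reference vector, keeping the grid layout."""
    return [[vec[attribute_index] for vec in row] for row in grid]


# Hypothetical 2x3 map; each reference vector is (age, income in thousands).
grid = [
    [(18, 12), (25, 28), (34, 41)],
    [(27, 30), (43, 55), (58, 62)],
]

age_plane = component_plane(grid, 0)
age_colors = [[age_flag(a) for a in row] for row in age_plane]
income_plane = component_plane(grid, 1)

for row in age_colors:
    print(row)
```

Placing `age_colors` next to `income_plane` mirrors putting the photos side by side: in this invented grid, the nodes with the oldest ages also carry the highest incomes.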
The unique SOM representation and visualization are powerful instruments for data modeling and exploration. However, the visualization described above is just the starting point for much more extensive and in-depth data mining and predictive modeling.
By combining the compact SOM data representation with the strength of classical statistics, Viscovery provides an approach to data analysis and predictive modeling that is unique in terms of intuition and effectiveness. The following capabilities have been chosen from a multitude of analytics features to provide an overview of some prominent fields of application.

Clustering
SOMs simplify clustering and allow the user to identify homogeneous data groups visually. In Viscovery, several clustering algorithms (SOM Single Linkage, Ward, and SOM-Ward) are available for automatically building clusters.
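To give a rough idea of how clusters can be built automatically from a trained map, the sketch below applies plain Ward agglomeration to a handful of reference vectors. It is a generic textbook version, not Viscovery's implementation, and it does not reproduce the SOM Single Linkage or SOM-Ward variants. The function names and the example vectors are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    na, nb = len(a), len(b)
    d2 = sum((x - y) ** 2 for x, y in zip(centroid(a), centroid(b)))
    return na * nb / (na + nb) * d2


def ward_clusters(vectors, k):
    """Merge the cheapest pair of clusters repeatedly until k clusters remain."""
    clusters = [[v] for v in vectors]
    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: ward_cost(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical reference vectors of six map nodes.
reference_vectors = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
clusters = ward_clusters(reference_vectors, k=2)
```

Clustering the reference vectors rather than the raw records keeps the number of items small, since a map has far fewer nodes than the data set has records.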

Prediction
Viscovery combines the non-linear data representation of the SOM with linear statistical prediction methods for each homogeneous sub-group to improve prediction accuracy.
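The idea of fitting a separate linear model per homogeneous sub-group can be sketched as follows. This is only an illustration of the general principle, not Viscovery's method: it assumes the sub-group of each record is already known (in practice it would come from the map), uses a single predictor with ordinary least squares, and all names and data are hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_local_models(records):
    """Fit one linear model per sub-group; records are (group, x, y) triples."""
    by_group = {}
    for group, x, y in records:
        by_group.setdefault(group, []).append((x, y))
    return {group: fit_line(points) for group, points in by_group.items()}


def predict(models, group, x):
    """Predict y for x using the linear model of the record's sub-group."""
    slope, intercept = models[group]
    return slope * x + intercept


# Invented data: the relationship between x and y differs between the two groups.
records = [
    ("young", 1, 3), ("young", 2, 5), ("young", 3, 7),         # y = 2x + 1
    ("senior", 10, 10), ("senior", 11, 9), ("senior", 12, 8),  # y = -x + 20
]
models = fit_local_models(records)
```

A single global line would fit neither group well in this example, whereas the two local lines together form a non-linear (piecewise-linear) model.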

Data representation
Data are highly compressed using statistical methods, allowing a single map that uses only a few megabytes of space to represent databases that are orders of magnitude larger.

Real-time classification
New data can be located in the map extremely quickly (up to 100,000 previously unseen data records can be classified per second), allowing real-time assessment of new data.
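Locating a new record in a trained map comes down to finding the node whose reference vector is nearest to it. The sketch below shows this lookup under simple assumptions: Euclidean distance, a map of only three nodes, and invented segment labels. It makes no claim about throughput, and the names `classify` and `sq_dist` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(record, reference_vectors, node_labels):
    """Return (node index, label) of the node closest to the record."""
    best = min(range(len(reference_vectors)),
               key=lambda i: sq_dist(record, reference_vectors[i]))
    return best, node_labels[best]


# Hypothetical trained map: reference vectors are (age, income in thousands).
reference_vectors = [(20, 15), (35, 40), (60, 55)]
node_labels = ["students", "families", "seniors"]

print(classify((33, 38), reference_vectors, node_labels))
print(classify((58, 60), reference_vectors, node_labels))
```

No retraining is needed for this step: the map is fixed, so each new record costs one distance computation per node.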
Viscovery is the leading commercial solution for data mining applications based on SOMs. Advantages in terms of technological superiority include the following:
