Predicting metaheuristic performance on graph coloring problems using data mining
A metaheuristic approach is introduced to determine which computer algorithms perform best on complex problems. As a case study, 5000 different graphs are randomly generated and two different graph coloring algorithms are applied to them. Viscovery SOMine is used to order the graphs by 16 graph measures and to analyze how the performance of the algorithms depends on these measures.
A discussion on visual interactive data exploration using self-organizing maps
This article provides an overview of state-of-the-art software tools for self-organizing map-based visual data exploration. Viscovery SOMine receives the best grades for data preprocessing and interaction with the map, and above-average grades for interaction with data and visualization, as well as label assignment.
Meta-learning of instance selection for data summarization
This article analyzes how to select a smaller subset from a large data set without losing much information. An instance selection method using k-means clustering is applied to 112 classification data sets with different compression rates. Viscovery SOMine is used to cluster the data sets with respect to their statistical properties and to analyze the classification accuracy of a naive Bayes classifier with respect to the compression rate. The resulting model enables the optimal compression rate to be predicted for new data sets.
Generalising algorithm performance in instance space: a timetabling case study
The performance of two timetabling algorithms is studied on a mix of 21 real-world and 8178 computer-generated timetabling problems of university courses. The timetabling problems are characterized by 21 meta-features (such as number of courses, number of rooms, graph-theoretical measures) and clustered with Viscovery SOMine. The resulting model shows how real-world and computer-generated problems differ and which algorithm performs better for which kind of problem in terms of the meta-features.
Characteristic-based clustering for time series data
This paper proposes a feature-engineering method for clustering time series based on their structural characteristics. Viscovery SOMine's SOM-Ward algorithm is used alongside complete linkage, k-means and fuzzy c-means clustering to test the generated features.
Data visualization of asymmetric data using Sammon mapping and applications of self-organizing maps
The performance of several software implementations of methods based on self-organizing maps is evaluated. Viscovery SOMine is found to be helpful in determining the number of clusters and recovering the cluster structure of data sets. A genocide and politicide data set is analyzed using Viscovery SOMine, followed by another analysis using public and private college data sets with the goal of identifying the schools with the best value.
A scalable method for time series clustering
Global measures to compare (long) time series are introduced. The self-organizing map is used for additional dimension reduction and, finally, the time series are clustered using Viscovery's SOM-Ward algorithm.
A comparison of software implementations of SOM clustering procedures
This review presentation compares the clustering possibilities of Viscovery SOMine with those in SOM_PAK and k-means clustering implemented in the SPSS Clementine software package. The Ward algorithm of Viscovery SOMine and its modified version resulted in the best cluster recovery rates and Rand statistic values of all considered methods.