Figure 1 | BMC Systems Biology

Figure 1

From: Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

Modified k-prototypes clustering of mixed data types. a) The data sets used for clustering and the components of the modk-prototypes algorithm. The type of each data set is given in parentheses. b) The k-prototypes algorithm was modified (termed modk-prototypes) to include B initializations of the assignment of the samples to the k clusters, for each k from 2 to N, the number of samples. d(X_i, Q_l) is the dissimilarity function between the ith sample and the lth cluster prototype. The cluster prototypes are updated and the samples are reassigned repeatedly until the cluster assignment no longer changes. The validity score is then computed for the final assignment of the samples. The number of clusters in the data is estimated by finding the assignment of the samples (over all B initializations and all k partitions) that yields the optimal validity score.
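
The loop described in panel b can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the caption does not give the form of d(X_i, Q_l) or of the validity score, so the sketch assumes the standard k-prototypes dissimilarity (squared Euclidean distance on numeric features plus a weight `gamma` times the number of categorical mismatches) and uses a placeholder validity score (mean within-cluster dissimilarity divided by the smallest prototype-to-prototype dissimilarity, lower is better). All function names, `gamma`, `k_max`, and `max_iter` are hypothetical.

```python
import numpy as np

def dissim_matrix(X_num, X_cat, Q_num, Q_cat, gamma):
    """d(X_i, Q_l) for all i, l. ASSUMED form: squared Euclidean distance on
    numeric features plus gamma * number of categorical mismatches."""
    d_num = ((X_num[:, None, :] - Q_num[None, :, :]) ** 2).sum(axis=2)
    d_cat = (X_cat[:, None, :] != Q_cat[None, :, :]).sum(axis=2)
    return d_num + gamma * d_cat

def update_prototypes(X_num, X_cat, labels, Q_num, Q_cat):
    """Numeric part becomes the cluster mean, categorical part the mode."""
    Q_num, Q_cat = Q_num.copy(), Q_cat.copy()
    for l in range(len(Q_num)):
        members = labels == l
        if not members.any():
            continue  # keep the old prototype for an empty cluster
        Q_num[l] = X_num[members].mean(axis=0)
        for j in range(X_cat.shape[1]):
            vals, counts = np.unique(X_cat[members, j], return_counts=True)
            Q_cat[l, j] = vals[counts.argmax()]
    return Q_num, Q_cat

def validity_score(D, labels, Q_num, Q_cat, gamma):
    """PLACEHOLDER score (lower is better), not the one used in the paper:
    mean within-cluster dissimilarity / smallest prototype separation."""
    within = D[np.arange(len(labels)), labels].mean()
    sep = dissim_matrix(Q_num, Q_cat, Q_num, Q_cat, gamma)
    sep = sep[~np.eye(len(Q_num), dtype=bool)].min()
    return within / sep if sep > 0 else np.inf

def modk_prototypes(X_num, X_cat, k_max, B=10, gamma=1.0, max_iter=100, seed=0):
    """For each k in 2..k_max, run B initializations; return the
    (score, k, labels) triple with the best validity score overall."""
    rng = np.random.default_rng(seed)
    X_num = np.asarray(X_num, dtype=float)
    X_cat = np.asarray(X_cat)
    best = None
    for k in range(2, k_max + 1):
        for _ in range(B):
            # Initialize the prototypes from k distinct samples.
            idx = rng.choice(len(X_num), size=k, replace=False)
            Q_num, Q_cat = X_num[idx].copy(), X_cat[idx].copy()
            D = dissim_matrix(X_num, X_cat, Q_num, Q_cat, gamma)
            labels = D.argmin(axis=1)
            for _ in range(max_iter):
                Q_num, Q_cat = update_prototypes(X_num, X_cat, labels, Q_num, Q_cat)
                D = dissim_matrix(X_num, X_cat, Q_num, Q_cat, gamma)
                new_labels = D.argmin(axis=1)
                if np.array_equal(new_labels, labels):
                    break  # no more change in cluster assignment
                labels = new_labels
            score = validity_score(D, labels, Q_num, Q_cat, gamma)
            if best is None or score < best[0]:
                best = (score, k, labels.copy())
    return best
```

A call such as `score, k, labels = modk_prototypes(X_num, X_cat, k_max=6, B=20)` would return the best-scoring partition found. The caption lets k run up to N, the number of samples; the sketch exposes `k_max` instead because the placeholder score is trivially zero when every sample forms its own cluster.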
