BMC Systems Biology

From: GeneTopics - interpretation of gene sets via literature-driven topic models

KEGG metabolic pathway gene sets. The GeneTopics algorithm was applied to 236 KEGG metabolic pathway gene sets. (a) Three pre-defined numbers of topics were tested in the validation for each gene set; the pie chart shows the distribution of gene sets by the optimal number of topics determined. (b) The number of relevant topics increases with the number of topics used to build the optimal LDA model. (c) The size of the gene set and the number of relevant topics found are positively correlated. All of these results are consistent with intuitive expectation and indicate that the GeneTopics algorithm operates properly and is suitable for large-scale analyses.

