From: Classification of breast cancer patients using somatic mutation profiles and machine learning approaches

Input matrix with C-scores of the top 50 mutated genes. The heat map shows the 50 most heavily mutated genes. Columns represent patients (358) and rows represent genes. One challenge of the dataset is its extreme sparsity, which is visible in the heat map: most cells are colored very close to blue, indicating a C-score of 0, with the exception of the first few columns. We found that the main data structure consists of 96% zeros.
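The sparsity figure and the gene selection described in the caption can be illustrated with a minimal sketch. It assumes the input is a genes-by-patients NumPy array of C-scores, and that "most heavily mutated" means the largest summed C-score per gene; that ranking criterion, the function names `sparsity` and `top_mutated_genes`, and the toy matrix are all hypothetical and not taken from the paper.

```python
import numpy as np


def sparsity(matrix):
    """Return the fraction of entries that are exactly zero."""
    matrix = np.asarray(matrix)
    return np.count_nonzero(matrix == 0) / matrix.size


def top_mutated_genes(matrix, k=50):
    """Return row indices of the k genes with the largest summed score.

    Assumption: rows are genes, columns are patients, and a gene's
    mutation load is the sum of its scores across all patients.
    """
    matrix = np.asarray(matrix)
    totals = matrix.sum(axis=1)
    # Stable sort on negated totals gives descending order, ties by row index.
    order = np.argsort(-totals, kind="stable")
    return order[:k]


# Toy matrix (3 genes x 4 patients), purely illustrative.
demo = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 5.0],
])

demo_sparsity = sparsity(demo)          # 9 of 12 entries are zero
demo_top = top_mutated_genes(demo, k=2)  # genes ranked by summed score
print(f"sparsity: {demo_sparsity:.2%}")
print("top genes (row indices):", demo_top.tolist())
```

Applied to the full matrix, `sparsity` would be expected to return a value near 0.96 if the data match the caption; a different definition of mutation load (for example, the number of patients with a nonzero score) would change which rows `top_mutated_genes` selects.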

