Table 1 Data sets from the MAQC project used in this work.

From: Embracing noise to improve cross-batch prediction accuracy

| Data set code | Data set description | Training set: number of samples | Training set: positives | Training set: negatives | Validation set: number of samples | Validation set: positives | Validation set: negatives |
|---|---|---|---|---|---|---|---|
| A | Lung tumorigen vs. nontumorigen (Mouse) | 70 | 26 | 44 | 88 | 28 | 60 |
| D | Breast cancer pre-operative treatment response (pathologic complete response) | 130 | 33 | 97 | 100 | 15 | 85 |
| F | Multiple myeloma overall survival milestone outcome | 340 | 51 | 289 | 214 | 27 | 187 |
| I | Same as data set F but class labels are randomly assigned | 340 | 200 | 140 | 214 | 122 | 92 |