Table 1 Data sets from MAQC project used in this work.

From: Embracing noise to improve cross-batch prediction accuracy

  

| Data set code | Data set description | Training set: number of samples | Training set: positives | Training set: negatives | Validation set: number of samples | Validation set: positives | Validation set: negatives |
| --- | --- | --- | --- | --- | --- | --- | --- |
| A | Lung tumorigen vs. nontumorigen (Mouse) | 70 | 26 | 44 | 88 | 28 | 60 |
| D | Breast cancer pre-operative treatment response (pathologic complete response) | 130 | 33 | 97 | 100 | 15 | 85 |
| F | Multiple myeloma overall survival milestone outcome | 340 | 51 | 289 | 214 | 27 | 187 |
| I | Same as data set F but class labels are randomly assigned | 340 | 200 | 140 | 214 | 122 | 92 |