Table 2 Candidate markers identified from the van de Vijver data set using the proposed method

From: Good practice guidelines for biomarker discovery from array data: a case study for breast cancer prognosis

| Group | Sample size n (good prog + poor prog) | Nested CV AUROCC performance (standard error) | Feature list (high expression → poor prognosis) | Feature list (high expression → good prognosis) |
| --- | --- | --- | --- | --- |
| All patient | 146 (68+78) | 0.73 (0.04) | BIRC5, CCNB2, CENPA, TK1, CCNE2, DKFZp762E1312, PRC1, STK15, SLC16A3, BUB1 | CEGP1, SLC11A3, C4A, ZNF145, MATN3, PGR, RAI2, DLX2 |
| ER+ | 107 (57+50) | 0.76 (0.05) | H1F2, COX6C, H2BFB, CCNE2, BLVRB | FST, DIO3, NTN4, DLX2, MATN3, COL3A1 |
| Node+ | 64 (30+34) | 0.80 (0.06) | H1F2, H2BFB, HA2FO, H2AFA, HABFB, KFZp762E1312, H2BFS | LTF, NTN4, HML2, PER1, DMBT1, ODZ2, WNT5A, SEMA3C |
| Node- | 82 (38+44) | 0.72 (0.06) | PRAME, FADSD6, TK1, TSSC3, CTSL2, BUB1 | CEGP1, ESR1, CYP4B1, SEC14L2, TBX3-iso, ZNF145 |
| ER+/Node+ | 50 (26+24) | 0.83 (0.06) | H1F2, H2BFB, H2AFP, H2AFA, H2BFB, COX6C, MSMB, BLVRB, , BCAS1 | LTF, LAMB3, C4A, NTN4, PTPRK, RTN1 |

  1. Many genes discovered in a larger group can also be discovered in its subgroups. For example, BIRC5 can be discovered in most of the subgroups. Such genes are not listed again for a subgroup unless they are more significant in that subgroup. Conversely, a gene may be listed for a larger group only because it is significant in one of that group's subgroups. For example, H1F2 is listed for the lymph node-positive (Node+) group only because it is significant in the ER+/Node+ subgroup. The nested CV performance is listed with its estimated standard error in parentheses.
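
The listing rule in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `subgroup_listing` is hypothetical, significance is represented here as a p-value (the note does not say which significance measure was used), and the numbers are invented for the example rather than taken from the study.

```python
def subgroup_listing(parent, subgroup):
    """Return the subgroup genes that would be listed under the rule.

    Both arguments map gene name -> p-value (an assumed significance
    measure; smaller means more significant). A gene already found in
    the parent group is re-listed only if it is strictly more
    significant in the subgroup.
    """
    listed = []
    for gene, p_sub in subgroup.items():
        p_parent = parent.get(gene)
        if p_parent is None or p_sub < p_parent:
            listed.append(gene)
    return listed


# Hypothetical p-values, invented purely for illustration.
all_patients = {"BIRC5": 0.001, "CCNE2": 0.010, "TK1": 0.004}
er_positive = {"BIRC5": 0.003, "CCNE2": 0.002, "COX6C": 0.005}

print(subgroup_listing(all_patients, er_positive))  # ['CCNE2', 'COX6C']
```

With these made-up values, BIRC5 is omitted from the subgroup because it is no more significant there than in the larger group, while CCNE2 is repeated because it is more significant in the subgroup, which mirrors how CCNE2 appears in both the all-patient and ER+ rows of the table.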