
Table 3. Success in retrieving the known transcription factor regulating an input list of its gene targets (a)

From: Using a large-scale knowledge database on reactions and regulations to propose key upstream regulators of various sets of molecules participating in cell metabolism

Lists of target genes (b)

| Number of regulatory candidates considered (c) | Known TF found among proposed gene candidates, according to coverage score (% of tests) | Known TF found among proposed gene candidates, according to specificity score (% of tests) | Known TF found among proposed molecule candidates, according to coverage score (% of tests) | Known TF found among proposed molecule candidates, according to specificity score (% of tests) |
|---|---|---|---|---|
| 1 | 11.9 | 12.2 | 13.2 | 14.4 |
| 10 | 36.3 | 35.9 | 39.0 | 40.3 |
| 20 | 44.1 | 45.1 | 47.8 | 49.9 |
| 50 | 54.9 | 57.7 | 59.1 | 62.6 |
| 100 | 66.7 | 67.1 | 70.7 | 72.5 |
| 200 | 75.5 | 75.9 | 78.8 | 79.7 |
| 500 | 82.5 | 82.6 | 86.2 | 86.5 |
| 1000 | 83.7 | 83.7 | 87.6 | 87.7 |

Lists of randomly shuffled genes (b)

| Number of regulatory candidates considered | Known TF found among proposed gene candidates, according to coverage score (% of tests) | Known TF found among proposed gene candidates, according to specificity score (% of tests) | Known TF found among proposed molecule candidates, according to coverage score (% of tests) | Known TF found among proposed molecule candidates, according to specificity score (% of tests) |
|---|---|---|---|---|
| 1 | 0.0 | 0.0 | 0.0 | 0.4 |
| 10 | 0.4 | 1.2 | 1.6 | 2.4 |
| 20 | 0.4 | 1.2 | 2.4 | 4.0 |
| 50 | 2.4 | 3.2 | 6.8 | 8.0 |
| 100 | 4.0 | 5.6 | 10.0 | 13.2 |
| 200 | 5.6 | 7.2 | 14.8 | 18.8 |
| 500 | 11.6 | 13.2 | 27.2 | 28.0 |
| 1000 | 20.0 | 21.2 | 38.4 | 38.8 |

(a) The rate of success corresponds to the number of tests, out of 250, in which the known transcription factor (TF) referenced in the transcriptional regulatory database (TRED) [17] was present among the n regulatory candidates that were automatically provided. Candidates were scored for coverage (i.e., the ability to explain the greatest number of targets) or for specificity (i.e., a tradeoff between the number of regulated targets and the total number of regulated molecules). Because the known TF can have the same score as a set of other candidates (i.e., ex aequo), the probability of finding the known TF among the candidates was estimated under the hypothesis that ex aequo candidates were randomly ordered. The results indicate that the rate of success increased with the number (n) of candidates considered.
(b) A total of 250 different lists of genes, each regulated by a known transcription factor, were automatically extracted from TRED. These lists contained between 1 and 352 target genes (see Additional file 2: Table S2). In a first step, each list of genes targeted by a known TF was successively submitted to analysis. In a second step, all genes included in these lists were randomly shuffled to constitute 250 lists of biologically non-relevant targets.
(c) The results indicate that the rate of success was reasonable when 50 to 100 candidates in the answer sets were considered (as indicated in bold face).
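
The handling of ex aequo candidates described in the first footnote can be illustrated with a short sketch. This is not the authors' code: it is one possible reading of the stated hypothesis (tied candidates are ordered uniformly at random), and the function names, candidate names, and scores are all hypothetical. If `better` candidates score strictly higher than the known TF and `tied` candidates (including the known TF) share its score, the known TF lands in the top n with probability (n - better) / tied, bounded to the interval [0, 1].

```python
from fractions import Fraction


def prob_known_tf_in_top_n(scores, known_tf, n):
    """Probability that known_tf is among the n best candidates.

    scores: dict mapping candidate name -> score (higher is better).
    Assumption: candidates sharing a score are ordered uniformly at random.
    """
    s = scores[known_tf]
    better = sum(1 for v in scores.values() if v > s)  # strictly ahead
    tied = sum(1 for v in scores.values() if v == s)   # includes known_tf
    slots = n - better  # top-n places left for the tied group
    if slots <= 0:
        return Fraction(0)
    return min(Fraction(1), Fraction(slots, tied))


def success_rate_percent(probabilities):
    """Average per-test probability, expressed as a percentage of tests."""
    return 100.0 * float(sum(probabilities)) / len(probabilities)


# Hypothetical example: TF_C is the known TF and is tied with two others.
scores = {"TF_A": 5, "TF_B": 3, "TF_C": 3, "TF_D": 3, "TF_E": 1}
p1 = prob_known_tf_in_top_n(scores, "TF_C", 1)  # no free slot
p2 = prob_known_tf_in_top_n(scores, "TF_C", 2)  # one slot, three tied
p4 = prob_known_tf_in_top_n(scores, "TF_C", 4)  # whole tied group fits
```

Under this reading, averaging such probabilities over the 250 tests would give a percentage comparable to one cell of the table for a given n, which is consistent with the table reporting fractional percentages rather than whole counts of tests.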