Table 1 Number of RNA sequences in training and test datasets

From: Predicting protein-binding regions in RNA using nucleotide profiles and compositions

| P:N | 1:1 | 1:2 | 1:4 | 1:6 | 1:8 | 1:10 |
|---|---|---|---|---|---|---|
| Training dataset (P:N) | 3,372:3,679 | 3,372:7,200 | 3,372:13,611 | 3,372:19,065 | 3,372:22,826 | 3,372:26,212 |
| Training subtotal | 7,051 | 10,572 | 16,983 | 22,437 | 26,198 | 29,584 |
| Test dataset (P:N) | 1,000:1,000 | 1,000:2,000 | 1,000:3,998 | 1,000:5,998 | 1,000:7,998 | 1,000:9,998 |
| Test subtotal | 2,000 | 3,000 | 4,998 | 6,998 | 8,998 | 10,998 |
| Total | 9,051 | 13,572 | 21,981 | 29,435 | 35,196 | 40,582 |

  1. Because similar sequences were removed separately from each 1:n dataset, the number of negative sequences (N) is not an exact multiple of the number of positive sequences (P).
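
The footnote can be checked with a little arithmetic. The following is a minimal sketch, not part of the original article: it takes the P and N counts from Table 1 and recomputes the actual N/P ratio and the number of sequences in each dataset. The variable and function names (`TRAIN_N`, `actual_ratios`, `subtotals`) are illustrative assumptions.

```python
# Counts copied from Table 1; all names here are illustrative, not from the article.
NOMINAL_N = [1, 2, 4, 6, 8, 10]  # the "n" in each nominal 1:n dataset
TRAIN_P, TRAIN_N = 3372, [3679, 7200, 13611, 19065, 22826, 26212]
TEST_P, TEST_N = 1000, [1000, 2000, 3998, 5998, 7998, 9998]


def actual_ratios(p, negatives):
    """Return N/P for each dataset, rounded to two decimals."""
    return [round(n / p, 2) for n in negatives]


def subtotals(p, negatives):
    """Return P + N for each dataset."""
    return [p + n for n in negatives]


for label, p, negs in (("Training", TRAIN_P, TRAIN_N), ("Test", TEST_P, TEST_N)):
    rows = zip(NOMINAL_N, actual_ratios(p, negs), subtotals(p, negs))
    for nominal, ratio, count in rows:
        print(f"{label:8s} nominal 1:{nominal:<2d} actual 1:{ratio:<5} sequences {count}")
```

Running the sketch shows how far each dataset departs from its nominal ratio. The test sets stay close to 1:n, while the training sets deviate more as n increases.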