BMC Systems Biology

Table 1 Number of RNA sequences in training and test datasets

From: Predicting protein-binding regions in RNA using nucleotide profiles and compositions

P:N	1:1	1:2	1:4	1:6	1:8	1:10
Training
Dataset	3,372:3,679	3,372:7,200	3,372:13,611	3,372:19,065	3,372:22,826	3,372:26,212
Subtotal	7,051	10,572	16,983	22,437	26,198	29,584
Test
Dataset	1,000:1,000	1,000:2,000	1,000:3,998	1,000:5,998	1,000:7,998	1,000:9,998
Subtotal	2,000	3,000	4,998	6,998	8,998	10,998
Total	9,051	13,572	21,981	29,435	35,196	40,582

Because similar sequences were removed separately from each 1:n dataset, the number of negative sequences (N) is not an exact multiple of the number of positive sequences (P).
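The note above can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the article: the counts are copied from the Dataset and Total rows of Table 1, while the names `TRAIN`, `TEST`, `TOTAL`, and `summarize` are illustrative assumptions. It computes the effective N/P ratio for each nominal 1:n setting and recomputes each column total from the four dataset counts.

```python
# Counts copied from Table 1, keyed by the nominal P:N ratio.
# Each value is (positives, negatives). All names here are illustrative.
TRAIN = {
    "1:1": (3372, 3679),
    "1:2": (3372, 7200),
    "1:4": (3372, 13611),
    "1:6": (3372, 19065),
    "1:8": (3372, 22826),
    "1:10": (3372, 26212),
}
TEST = {
    "1:1": (1000, 1000),
    "1:2": (1000, 2000),
    "1:4": (1000, 3998),
    "1:6": (1000, 5998),
    "1:8": (1000, 7998),
    "1:10": (1000, 9998),
}
# "Total" row of Table 1, used only as a cross-check.
TOTAL = {
    "1:1": 9051,
    "1:2": 13572,
    "1:4": 21981,
    "1:6": 29435,
    "1:8": 35196,
    "1:10": 40582,
}


def summarize():
    """Return the effective N/P ratios and the recomputed total per column."""
    rows = {}
    for ratio, (train_pos, train_neg) in TRAIN.items():
        test_pos, test_neg = TEST[ratio]
        rows[ratio] = {
            "train_ratio": train_neg / train_pos,
            "test_ratio": test_neg / test_pos,
            "total": train_pos + train_neg + test_pos + test_neg,
        }
    return rows


if __name__ == "__main__":
    for ratio, row in summarize().items():
        matches = row["total"] == TOTAL[ratio]
        print(
            f"{ratio}: train N/P = {row['train_ratio']:.3f}, "
            f"test N/P = {row['test_ratio']:.3f}, "
            f"total = {row['total']} (matches table: {matches})"
        )
```

Run as a script, this prints one line per column. The test-set ratios stay very close to the nominal n, while the training-set ratios drift further from it as n grows, which is the effect the note describes.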


ISSN: 1752-0509
