
Table 1 Dataset of thirteen interacting protein-RNA pairs

From: Prediction of protein-RNA residue-base contacts using two-dimensional conditional random field with the lasso

| Protein A: UniProt | Protein A: Pfam | Protein A: chain | Protein A: length | RNA B: GenBank | RNA B: Rfam | RNA B: chain | RNA B: length | PDB code | # sequences in MSA | # contacts ≤ 3 Å | # contacts ≤ 5 Å |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RL18_THETH | PF00861 | R | 110 | X01554 | RF00001 | B | 117 | 2hgu | 1543 | 28 | 85 |
| RL27_THET8 | PF01016 | Z | 81 | X12612 | RF01118 | A | 108 | 2hgu | 1356 | 20 | 67 |
| RL27_ECOLI | PF01016 | W | 77 | J01695 | RF01118 | 8 | 108 | 3kcr | 1356 | 18 | 69 |
| RL33_THET8 | PF00471 | 5 | 48 | X12612 | RF01118 | A | 108 | 2hgu | 1445 | 18 | 40 |
| RL35_ECOLI | PF01632 | 3 | 61 | J01695 | RF01118 | 8 | 108 | 3kcr | 1337 | 12 | 38 |
| RS5_ECOLI | PF00333 | E | 67 | J01695 | RF00177 | A | 1530 | 3kc4 | 1701 | 13 | 57 |
| RS7_ECOLI | PF00177 | G | 147 | J01695 | RF00177 | A | 1530 | 3kc4 | 1941 | 25 | 127 |
| RS8_THET8 | PF00410 | K | 135 | M26923 | RF00177 | A | 1515 | 1yl4 | 1889 | 29 | 93 |
| RS10_THET8 | PF00338 | M | 97 | M26923 | RF00177 | A | 1515 | 1yl4 | 1711 | 20 | 84 |
| RS12_THET8 | PF00164 | O | 122 | M26923 | RF00177 | A | 1515 | 1yl4 | 1972 | 45 | 161 |
| RS15_ECO57 | PF00312 | O | 83 | J01695 | RF00177 | A | 1530 | 3kc4 | 1821 | 21 | 89 |
| RS17_ECOLI | PF00366 | Q | 69 | J01695 | RF00177 | A | 1530 | 3kc4 | 1690 | 18 | 85 |
| RS17_THET8 | PF00366 | T | 69 | M26923 | RF00177 | A | 1515 | 1yl4 | 1690 | 29 | 93 |

  1. For each protein-RNA pair, the table shows the UniProt and Pfam identifiers, the PDB chain, and the length of protein sequence A; the GenBank and Rfam identifiers, the PDB chain, and the length of RNA sequence B; the PDB code; the number of sequences in the multiple sequence alignment (MSA), combined on the basis of organism; and the number of contacts within 3 Å and within 5 Å.
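
The two contact columns can be illustrated with a minimal sketch of distance-threshold counting. The table does not state how a residue-base distance is measured, so this sketch assumes that a residue and a base are in contact when their closest pair of atoms lies within the cutoff. The function name `count_contacts`, the nested-list coordinate layout, and the toy coordinates are all hypothetical and are not taken from any PDB entry.

```python
import math


def count_contacts(residues, bases, cutoff):
    """Count residue-base pairs whose closest atoms are within `cutoff` angstroms.

    residues, bases: lists of atom-coordinate lists, one inner list of
    (x, y, z) tuples per residue or base (hypothetical layout).
    Assumption: distance is the minimum over all atom pairs.
    """
    count = 0
    for res_atoms in residues:
        for base_atoms in bases:
            closest = min(
                math.dist(a, b) for a in res_atoms for b in base_atoms
            )
            if closest <= cutoff:
                count += 1
    return count


# Toy coordinates for illustration only.
residues = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]]
bases = [[(3.5, 0.0, 0.0)], [(14.0, 0.0, 0.0)]]

contacts_3 = count_contacts(residues, bases, 3.0)
contacts_5 = count_contacts(residues, bases, 5.0)
```

Every pair counted at the 3 Å cutoff is also counted at 5 Å, which matches the table: each row's ≤ 5 Å count is at least as large as its ≤ 3 Å count.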