Table 2 Examples of named substructures and appearance in KEGG COMPOUND, KEGG DRUG and KNApSAcK databases.

From: KCF-S: KEGG Chemical Function and Substructure for improved interpretability and prediction in chemical bioinformatics

KCF-S / annotation COMPOUND (#S / #C) DRUG (#S / #C) KNApSAcK (#S / #C)
BOND    
C5a-N1b / amide bond 4174 / 2192 2678 / 1385 6784 / 2528
C7a-O7a / carboxylate ester bond 3040 / 2198 1787 / 1329 21857 / 13166
C5a-S2a / thioester bond 455 / 453 31 / 30 36 / 36
N2b-N2b / diazo bond 83 / 73 83 / 19 11 / 11
S3a-S3a / disulfide bond 40 / 37 40 / 26 43 / 33
N1b-N1b / hydrazine bond 15 / 13 22 / 15 3 / 3
TRIPLET    
C6a-C1c-N1a / alpha-amino acid 512 / 484 113 / 104 191 / 183
C5a-C1b-C5a / beta-keto carbonyl 270 / 106 6 / 6 36 / 36
C6a-C5a-O5a / alpha-keto carboxylate 169 / 168 10 / 8 46 / 46
C6a-C1c-O1a / alpha-hydroxy carboxylate 167 / 154 236 / 137 108 / 87
VICINITY    
C1y(C1y+C1y+O1a) / cyclic secondary alcohol 10099 / 3090 1171 / 388 49015 / 11697
C8y(C8x+C8x+O1a) / phenolic hydroxy 1562 / 1263 376 / 313 9978 / 7219
C5a(N1b+N1b+O5a) / pseudourea 66 / 65 82 / 77 46 / 43
N1c(C1b+C1b+C1b) / tertiary amine 54 / 48 302 / 235 0 / 0
C5x(N1x+N1x+O5x) / cyclic pseudourea 36 / 36 30 / 29 20 / 20
RING    
C1y(C1b)-C1y(O1a)-C1y(O1a)-C1y(O1a)-C1y(O2a)-O2x / pyranose sugar ring 1024 / 824 64 / 54 7670 / 6187
C8x-N4y(C1y)-C8y(N5x)-C8y(C8y)-N5x / imidazole ring 549 / 535 48 / 47 84 / 84
C8x-N4y(C1y)-C8y-N5x-C8x-N5x-C8y(N1a)-C8y-N5x / adenine ring 428 / 420 17 / 17 55 / 55
C1x-C1x-N1y(C1b)-C1x-C1x-N1y(C1b) / piperazine ring 7 / 7 45 / 45 0 / 0
C8x-C8y(C2b)-C8x-C8y(O1a)-C8y(O1a)-C8y(O1a) / 5-alkenylbenzene-1,2,3-triol 3 / 3 0 / 0 12 / 12
SKELETON    
C1b(O2b)-C1y(O2x)-C1y(O1a)-C1y(O1a)-C1y(N4y+O2x) / ribofuranose 255 / 255 20 / 20 62 / 62
C1x(N1y)-C1x(N1y) / ethylenediamine in ring 136 / 136 702 / 702 0 / 0
C1a-C1c(C1a)-C1b-C1c(N1b)-C5a(N1b+O5a) / leucine residue 102 / 102 79 / 79 228 / 228
C7a(O6a+O7a)-C8y-C8x-C8x-C8y(O2a)-C8x-C8x / p-hydroxybenzoate residue 0 / 0 3 / 3 51 / 51
INORGANIC    
O1c-P1b(O2b(C1y))(O1c)-O1c / cyclic secondary alcohol orthophosphate 520 / 520 19 / 19 66 / 66
O1c-P1b(O2b(C1b))(O1c)-O1c / primary alcohol orthophosphate 387 / 387 43 / 43 97 / 97
O1c-P1b(O2b(C1y))(O2b(C1b))-O1c / cyclic orthophosphate 173 / 173 2 / 2 2 / 2
O3a-N2b(C8y)-O3a / aryl nitro 304 / 304 164 / 164 48 / 48
N2b(C2c)-O1b / oxime 27 / 27 22 / 22 61 / 61
  1. #S represents the number of KCF-Substructures, and #C represents the number of compounds containing those KCF-Substructures. Note that the annotations are not necessary-and-sufficient definitions. For example, the "N1b-N1b" bond is a hydrazine bond, but there are other types of hydrazine bonds: "N1b-N1c" is a hydrazine bond with three substituents, and "N1x-N1x" is a hydrazine bond in a ring structure.
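
To make the row layout concrete, the following is a minimal sketch of reading one table row into its KCF-S string, annotation, and the three "#S / #C" pairs. It is not part of KCF-S or any KEGG tool: the names `parse_row`, `Row`, and `mean_per_compound` are hypothetical, and the sketch assumes each row is a single line with fields separated by single spaces. The ratio #S / #C is computed on the assumption that #S counts substructure occurrences, an interpretation suggested by #S being at least #C in every row.

```python
import re
from typing import NamedTuple, Optional

# Column order as given in the table header.
DATABASES = ("COMPOUND", "DRUG", "KNApSAcK")

# Assumed row shape: "<KCF-S> / <annotation> S / C S / C S / C".
# KCF-S strings in this table contain no spaces, so \S+ is sufficient.
ROW_RE = re.compile(
    r"^(?P<kcfs>\S+) / (?P<annotation>.+?)"
    r" (\d+) / (\d+) (\d+) / (\d+) (\d+) / (\d+)$"
)


class Row(NamedTuple):
    kcfs: str
    annotation: str
    counts: dict  # database name -> (#S, #C)


def parse_row(line: str) -> Optional[Row]:
    """Parse one data row; return None for category lines such as 'BOND'."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    nums = [int(g) for g in m.groups()[2:]]  # the six count fields
    counts = {
        db: (nums[2 * i], nums[2 * i + 1]) for i, db in enumerate(DATABASES)
    }
    return Row(m.group("kcfs"), m.group("annotation"), counts)


def mean_per_compound(row: Row, db: str) -> Optional[float]:
    """#S / #C for one database, or None when no compound contains it."""
    s, c = row.counts[db]
    return s / c if c else None


row = parse_row("C5a-N1b / amide bond 4174 / 2192 2678 / 1385 6784 / 2528")
print(row.kcfs, "|", row.annotation, "|", row.counts["DRUG"])
print(mean_per_compound(row, "COMPOUND"))
```

Category lines (BOND, TRIPLET, and so on) do not match the pattern and yield `None`, so a caller can use them to track the current substructure class while iterating over the table.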