Table 1 List of all features.

From: Identifying essential genes in bacterial metabolic networks with machine learning methods

Short form Explanation
Topology features
  a) Deviation
RUP Reachable/Unreachable Products (RUP): equals one if all products can still be produced when the reaction is blocked, otherwise zero
PUP Percentage of Unreachable Products (PUP): the percentage of products that cannot be produced when the reaction is blocked
ND Number of Deviations (ND)
APL Average Path Length (APL): the average path length of the deviations
LSP Length of the Shortest Path (LSP): the length of the shortest path of the deviations
  b) Local topology
NS Number of Substrates (NS)
NP Number of Products (NP)
NNR Number of Neighboring Reactions (NNR)
NNNR Number of Neighbors of Neighboring Reactions (NNNR)
CCV Clustering Coefficient Value (CCV): clustering coefficient of a reaction
DIR Directionality of a reaction (DIR)
  c) Choke points and load scores
CP Choke Point (CP): whether or not a reaction is a choke point (Rahman et al, 2006)
LS Load Score (LS): load score of a reaction (Rahman et al, 2006)
  d) Damage
NDR Number of Damaged Reactions (NDR) (Lemke et al, 2004)
NDC Number of Damaged Compounds (NDC) (Lemke et al, 2004)
NDRD Number of Damaged Reactions having no Deviations (NDRD): the number of damaged reactions that have no other alternative paths to be reached after blocking a reaction
NDCD Number of Damaged Compounds having no Deviations (NDCD): the number of damaged compounds that have no other alternative paths to be reached after blocking a reaction
NDCR Number of Damaged Choke point Reactions (NDCR)
NDCC Number of Damaged Choke point Compounds (NDCC)
NDCRD Number of Damaged Choke point Reactions having no Deviations (NDCRD): the number of damaged choke point reactions that have no other alternative paths to be reached after blocking a reaction
NDCCD Number of Damaged Choke point Compounds having no Deviations (NDCCD): the number of damaged choke point compounds that have no other alternative paths to be reached after blocking a reaction
  e) Centrality
BW Betweenness centrality
CN Closeness centrality
EC Eccentricity centrality
EV Eigenvector centrality
Genomic and transcriptomic features
  f) Homologs
NAR Number of Associated Reactions (NAR): the number of reactions that depend on the knocked-out gene
Hn Homology at different expectation values: the number of homologous genes with e-value cutoffs 10^-30, 10^-20, 10^-10, 10^-7, 10^-5, 10^-3 (H30, H20, H10, H7, H5, H3)
  g) Gene expression
NGSE Number of Genes having Similar Expression (NGSE): the number of genes that have similar expression (correlation coefficient >0.8)
MCC Maximum of Correlation Coefficients (MCC): maximum value of the correlation coefficients for all neighboring genes
  h) Phyletic retention
PR Phyletic Retention (PR): the number of orthologs in the other prokaryotes
  i) Codon usage
Nc Number of codons
N3s Base composition at silent sites (T3s, C3s, A3s, G3s)
glt The frequency of the amino acid glutamine (as an example)
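
As an illustration of the local topology and choke point features (NS, NP, NNR, CP), the following is a minimal sketch on a hypothetical toy network. It is not the authors' implementation. Two definitions are assumptions, since the table does not spell them out: a choke point is taken to be a reaction that is the only consumer or the only producer of some metabolite, and neighboring reactions are taken to be those sharing at least one metabolite with the reaction. The reaction names, metabolite names, and the function `local_features` are invented for the example.

```python
from collections import defaultdict

# Hypothetical toy network: reaction id -> (set of substrates, set of products)
reactions = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"B"}, {"D"}),
    "R4": ({"C", "D"}, {"E"}),
    "R5": ({"A"}, {"C"}),
}


def local_features(reactions):
    """Compute NS, NP, NNR and CP for every reaction (assumed definitions)."""
    producers = defaultdict(set)  # metabolite -> reactions producing it
    consumers = defaultdict(set)  # metabolite -> reactions consuming it
    for rid, (substrates, products) in reactions.items():
        for m in substrates:
            consumers[m].add(rid)
        for m in products:
            producers[m].add(rid)

    features = {}
    for rid, (substrates, products) in reactions.items():
        # Assumption: neighbors are reactions sharing any metabolite.
        neighbors = set()
        for m in substrates | products:
            neighbors |= producers[m] | consumers[m]
        neighbors.discard(rid)

        # Assumption: choke point = sole consumer of a substrate
        # or sole producer of a product.
        is_choke_point = (
            any(consumers[m] == {rid} for m in substrates)
            or any(producers[m] == {rid} for m in products)
        )
        features[rid] = {
            "NS": len(substrates),
            "NP": len(products),
            "NNR": len(neighbors),
            "CP": int(is_choke_point),
        }
    return features


features = local_features(reactions)
for rid in sorted(features):
    print(rid, features[rid])
```

In this toy network, R1, R3 and R4 are flagged as choke points, while R2 and R5 are not, because metabolite C has two producers and metabolites A and B each have two consumers.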
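
The gene expression features NGSE and MCC can be sketched as follows. This is a hedged illustration, not the authors' code. It assumes the correlation coefficient is Pearson's, that expression profiles are given as equal-length lists of measurements, and that the set of "neighboring genes" for MCC is supplied by the caller (the table does not define how neighbors are chosen). The gene names, values, and the function `expression_features` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def expression_features(gene, expression, neighbors, threshold=0.8):
    """Return (NGSE, MCC) for one gene.

    NGSE: number of other genes with correlation > threshold.
    MCC: maximum correlation over the given neighbor genes
         (None if no neighbor has an expression profile).
    """
    target = expression[gene]
    ngse = sum(
        1
        for other, profile in expression.items()
        if other != gene and pearson(target, profile) > threshold
    )
    neighbor_corrs = [
        pearson(target, expression[n])
        for n in neighbors
        if n != gene and n in expression
    ]
    mcc = max(neighbor_corrs) if neighbor_corrs else None
    return ngse, mcc


# Hypothetical expression profiles across four conditions
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [2.0, 1.0, 4.0, 3.0],  # moderately correlated with geneA
}

ngse, mcc = expression_features("geneA", expression, neighbors=["geneC", "geneD"])
print("NGSE:", ngse, "MCC:", mcc)
```

Here only geneB exceeds the 0.8 threshold for geneA, so NGSE is 1, and MCC is taken over the two supplied neighbors only.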