Integrative inference of gene-regulatory networks in Escherichia coli using information theoretic concepts and sequence analysis
© Kaleta et al; licensee BioMed Central Ltd. 2010
Received: 15 January 2010
Accepted: 18 August 2010
Published: 18 August 2010
Although Escherichia coli is one of the best studied model organisms, a comprehensive understanding of its gene regulation is not yet achieved. There exist many approaches to reconstruct regulatory interaction networks from gene expression experiments. Mutual information based approaches are most useful for large-scale network inference.
We used a three-step approach in which we combined gene regulatory network inference based on directed information (DTI) and sequence analysis. DTI values were calculated on a set of gene expression profiles from 19 time course experiments extracted from the Many Microbes Microarray Database. Focusing on influences between pairs of genes in which one partner encodes a transcription factor (TF) we derived a network which contains 878 TF - gene interactions of which 166 are known according to RegulonDB. Afterward, we selected a subset of 109 interactions that could be confirmed by the presence of a phylogenetically conserved binding site of the respective regulator. By this second step, the fraction of known interactions increased from 19% to 60%. In the last step, we checked the 44 of the 109 interactions not yet included in RegulonDB for functional relationships between the regulator and the target and, thus, obtained ten TF - target gene interactions. Five of them concern the regulator LexA and have already been reported in the literature. The remaining five influences describe regulations by Fis (with two novel targets), PdhR, PhoP, and KdgR. For the validation of our approach, one of them, the regulation of lipoate synthase (LipA) by the pyruvate-sensing pyruvate dehydrogenase repressor (PdhR), was experimentally checked and confirmed.
We predicted a set of five novel TF - target gene interactions in E. coli. One of them, the regulation of lipA by the transcriptional regulator PdhR was validated experimentally. Furthermore, we developed DTInfer, a new R-package for the inference of gene-regulatory networks from microarrays using directed information.
Gene regulation represents a central mechanism in the control of the phenotype of an organism. Thus, the comprehension of gene regulatory mechanisms is a central topic in Systems Biology. The prokaryote Escherichia coli is best suited as a model organism for genome-wide network inference studies due to the available and well-documented molecular biological knowledge and the remarkable amount of published genome-wide data. Relevance or association networks are widely used for genome-wide network inference. They require, first, a measure to evaluate the association of pairs of genes, second, a threshold to cut off irrelevant associations, and, third, a criterion or algorithm to discriminate between direct and indirect interactions. The ready-to-use algorithms ARACNE [3, 4], Context Likelihood of Relatedness (CLR) and MRNET use mutual information (MI) as the association measure. A drawback of MI is that it is an undirected measure. That is, to derive causal relations from the inferred associations between interacting nodes, further information is necessary, in particular, to qualify one node as the regulator and the other as the target. There are several ways to derive a causal interaction from an inferred association: First, one can integrate prior knowledge. In  the inferred interactions are restricted to cases where one partner is a transcription factor (TF). Another approach is to use active and gene-specific interventions, like knockouts, knockdowns or overexpression. A third way is to exploit time series data and use them to infer the direction of association from temporal patterns. In this context directed information (DTI) can be used. DTI is an extension of the concept of MI that allows one to measure the direction of an information flow between two random variables. It has been used earlier to infer gene regulatory mechanisms in kidney development.
In this work we improved the computation of DTI and used it to infer regulatory networks on a genome scale.
A second important step in the inference of gene-regulatory networks is the integration of additional knowledge. This process allows one to reduce the number of false positive predictions. One such approach is the integration of information extracted from genome sequence data. For predicted interactions between TFs and genes it is, for instance, possible to align the promoter regions of the predicted targets of a specific TF with each other to detect overrepresented motifs . One possible explanation of such overrepresented motifs is that they correspond to a binding site of a common TF. On the other hand, if some binding sites of a TF are already known, the promoter regions of the putative target genes can be searched for sequences resembling these known binding sites. However, the sequences of binding sites can be very heterogeneous. In consequence, a binding site can be additionally validated by checking its phylogenetical conservation over several species . This approach was used in this work.
A third step to reduce the number of false positive interactions is to integrate prior knowledge in the form of known functional relationships between the regulator and the predicted target into the inference procedure. Finally, the predicted interactions have to be verified experimentally. In order to select a candidate interaction to verify we chose the regulation of a gene by a transcription factor whose targets are most suitably detected by our method. Thus, the present work demonstrates the full cycle of systems biological work, from genome-wide data analysis via large-scale modeling, to prediction of testable hypotheses by studying certain regulatory modules of interest, and, finally, to the prediction and experimental validation of novel molecular mechanisms.
Results and Discussion
The inferred interactions were validated by a search for a phylogenetically conserved binding site of the TF upstream of its putative target. The validated interactions were manually enriched for functional relationships between the regulator and the targets to select candidates for an experimental verification. Finally, the most promising regulatory interaction between PdhR and lipA was experimentally verified by an electrophoretic mobility shift assay (EMSA).
Validation of predicted interactions by sequence analysis
The predicted TF - gene interactions were validated by searching for phylogenetically conserved transcription factor binding sites (TFBS) in the promoter region of the presumed target genes. To this end, known binding sites of the regulator were aligned with the promoter region of the putative target using the motif discovery tool cosmo. If a region that resembles the known binding sites of the regulator was discovered, we checked whether this region overlaps to more than 50% with a phylogenetically conserved region of the genome. If we found such an overlap, the interaction was accepted. For more information on the search for binding sites see Methods and Additional File 1: Supplemental Material S4. This leads to 109 accepted interactions, of which 65 are known according to RegulonDB [Additional File 1: Supplemental Material S7]. While the total number of predicted interactions dropped from 878 to 109, the fraction of known interactions increased from 166/878 = 19% within the network inferred using DTI to 65/109 = 60% when additionally requiring the presence of a phylogenetically conserved binding site of the regulator. Thus, the search for phylogenetically conserved binding sites reduces the number of inferred interactions to a much smaller set which is supported by additional evidence. However, due to this step we might also lose true positive interactions since only about one third of the TFBS overlap to more than 50% with a conserved region of the E. coli genome [Additional File 1: Supplemental Material S4].
Predicted and functionally related interactions
Predicted targets of LexA
The search for phylogenetically conserved binding sites for predicted interactions identified five new targets of the transcription factor LexA: cho, dinB, dinI, dinD and yebG. These five genes have been reported previously to be regulated by LexA [12, 16], but were not yet included in RegulonDB 6.1.
We found strong evidence, both on an expression and phylogenetic level, for the regulation of lipA by PdhR. PdhR is an important regulator of central metabolism by controlling the transcription of the components of the pyruvate dehydrogenase complex and several genes involved in the respiratory chain. lipA encodes the lipoate synthase which catalyzes the last step in lipoate biosynthesis and incorporation. Lipoate is an important co-factor of LpdA that is contained in the pyruvate dehydrogenase complex, oxoglutarate dehydrogenase and the glycine cleavage complex.
According to RegulonDB, PhoP binds in the promoter regions of 31 genes. Among the genes regulated by PhoP are two genes involved in methionine biosynthesis. One of the corresponding enzymes, encoded by metB, catalyzes the incorporation of sulfur contained within cysteine into O-succinyl-L-homoserine to produce cystathionine, which is subsequently converted into methionine. A putative phylogenetically conserved binding site of PhoP in the upstream region of cysB was detected. cysB encodes a TF regulating several genes necessary for the production of cysteine from which methionine is synthesized in E. coli.
A newly predicted target of KdgR is edd encoding a gluconate dehydratase in the Entner-Doudoroff pathway. While a binding site of KdgR in the upstream region of eda, the Entner-Doudoroff aldolase, is known , hitherto no binding site upstream of edd which precedes eda on the chromosome has been reported. Regulation by KdgR induces eda if glucuronate, galacturonate, or methyl-β-d-glucuronide are present in the growth media . The activation of eda allows the growth on these compounds. The existence of a binding site upstream of edd would furthermore allow a control of the metabolic flux into the pentose-phosphate-pathway.
Selection of candidates and experimental verification
In order to select a candidate for the experimental validation of a predicted interaction we compared the z-scores and the average phylogenetical conservation of known binding sites of each TF (Table 1) for the interactions reported in the last section. In particular, the TFBS of LexA and PdhR are well conserved. Furthermore, the z-scores of the interactions of these regulators are the highest. Since the predicted targets of LexA have already been experimentally verified in [12, 16], we chose the regulation of lipA by PdhR as the best candidate for an experimental validation of a predicted interaction.
R-package for Network Inference
The methods for the inference of gene regulatory networks presented in this work have been implemented in an R-package that can be downloaded from http://users.minet.uni-jena.de/~m3kach/DTInfer/ [Additional File 2]. Given the time-courses of expression values for a set of genes over one or several experiments, DTI values are computed between arbitrary sets of genes. Additionally it is possible to compute MI values. DTIs and MIs can be estimated using one of two estimators: a kernel density estimator we implemented based on the work of  and a b-spline estimator based on the work of , implemented by Boris Hayete and provided by courtesy of the Gardner Lab of Boston University. The significance of the MI or DTI values, as well as of values provided by the user, is assessed through the Context Likelihood of Relatedness algorithm presented in . Finally, interactions are inferred either at a certain precision (see Methods), for a given number of interactions or for a user-defined threshold.
In this work we used directed information (DTI) to infer transcription factor gene interactions in E. coli. In contrast to previous works using DTI we improved the inference procedure in several points. First, we used a more precise algorithm for the computation of mutual information required for the estimation of DTI. Second, we used CLR in order to determine the significance of the DTI values. This step is necessary to remove interactions starting or ending in genes which have high DTI values with many other genes. Third, we validated the inferred interactions by the search for phylogenetically conserved transcription factor binding sites. This last step in particular drastically increases the fraction of true positives in the set of inferred interactions. Finally, by additionally requiring a functional relationship between regulator and target, we extracted a set of ten TF - gene interactions of which five are unknown in the literature. We predicted that PhoP regulates cysB encoding a global regulator of cysteine biosynthesis, KdgR putatively regulates edd encoding the gluconate dehydratase, Fis putatively regulates rpsI and rplM encoding two ribosomal proteins, and PdhR regulates lipA encoding the lipoate synthase.
By experimentally validating the most likely candidate of a predicted interaction we were able to shed new light on the regulation of central metabolism. We found that the transcription factor PdhR not only regulates the expression of the pyruvate dehydrogenase (PDH) multi-enzyme complex, but additionally controls the production of the co-factor lipoate required for the activity of this enzyme complex by regulating the expression of the lipoate synthase LipA. Thus, these new findings further emphasize the role of pyruvate-sensing PdhR in the control of the activity of LpdA, the E3 component of the pyruvate dehydrogenase complex, the oxoglutarate dehydrogenase complex and the glycine cleavage complex. Moreover they underline the key role of the regulator PdhR in the control of fluxes at the pyruvate node that connects glycolysis, citric acid cycle and lipid metabolism.
In addition to the ten predicted interactions, we found eight cases for which we did not detect a phylogenetically conserved binding site, but for which we could support our prediction using data from the literature [Additional File 1: Supplemental Materials S8 and S9]. In one case a binding site has been detected independently. In seven cases an alternative operon structure reported in the literature supports the predicted interactions.
In conclusion, our work demonstrates the importance of integration of different types of data and prior knowledge into network inference algorithms in order to stringently plan new experiments that are able to identify hitherto unknown molecular interactions in gene regulatory networks. We started from a large compendium of gene-expression experiments and inferred 878 putative regulatory interactions. By probing these predicted interactions with independent knowledge from phylogenetic and sequence data we were able to narrow down the list of potential interactions to a smaller list of 109 validated interactions, which could be surveyed by manual inspection. Of the 44 interactions contained in this list, which were not yet present in RegulonDB 6.1, we identified ten interactions where we could also identify a functional relationship between the regulator and the target. Of these ten targets, five were already reported in the literature. Thus, we narrowed down the list of 878 interactions to five very likely targets that should be verified by experiment. Finally, genome-wide data analysis and modeling was the driving force to design experiments for the discovery of the regulation of lipA by PdhR. In consequence, our approach further emphasizes the vital importance of the combination of different bioinformatics methods for saving resources in experimental work by in silico selection of most likely candidates for the time-consuming and expensive procedure of experimental verification.
where Y^n denotes (Y1, Y2, ..., Yn), that is, a segment of the realization of the random sequence Y. DTI can be interpreted as the mutual information between the time course of X up to the current point n and the current value of Y, given all values of Y up to the previous instant n - 1. Since we sum over all time points, the relationship at every time point is taken into account.
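The verbal description above matches the standard definition of directed information given by Massey. As a reconstruction in LaTeX (the symbols follow the text: X^n and Y^n are the first n values of each series, N is the number of time points; that this is term by term the formula the text refers to is an assumption), it can be written as:

```latex
I(X^N \to Y^N) \;=\; \sum_{n=1}^{N} I\!\left(X^{n};\, Y_{n} \,\middle|\, Y^{n-1}\right)
```

Each summand is the conditional mutual information between the history of X up to time n and the current value Y_n, given the past of Y.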
where 0Y^(n-1) denotes the concatenation of 0 and Y^(n-1), i.e., (0, Y1, Y2, ..., Yn-1). This concatenation is equivalent to considering the pairs (X2,Y1), (X3,Y2), ..., (Xn,Yn-1) of expression values, in which the X-values are shifted one time step into the future. In consequence, directed information can also be understood as the mutual information between X and Y minus the information flow from the time series of Y, shifted by one step, to X. Hence, by subtracting the causal (shifted) relationship from Y to X, the causal dependency from X to Y remains.
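The decomposition described in this paragraph corresponds to the known conservation law of directed information. Written in LaTeX, with the same symbols as in the text, it reads as follows; whether the formula referred to in the text has exactly this form is an assumption, since only its verbal description is available here:

```latex
I(X^N \to Y^N) \;=\; I\!\left(X^{N};\, Y^{N}\right) \;-\; I\!\left(0Y^{N-1} \to X^{N}\right)
```

The first term on the right is the ordinary mutual information between the two series; the second is the directed information from the one-step-delayed series of Y to X.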
To evaluate the mutual information term in equation (2), a b-spline estimator based on the work of  and implemented by Boris Hayete of the Gardner Lab as part of the CLR algorithm has been used. More details on the implementation of the DTI estimator can be found in Additional File 1: Supplemental Material S1.
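The b-spline estimator itself is not reproduced here. As a simpler stand-in that illustrates what a mutual information estimator computes, the following sketch uses a plain histogram (plug-in) estimate; the function name, the number of bins and the use of NumPy are assumptions for illustration and are not part of DTInfer.

```python
import numpy as np

def mutual_information(x, y, bins=4):
    """Histogram (plug-in) estimate of I(X;Y) in bits.

    Note: this is NOT the b-spline estimator used in the paper; it only
    illustrates the quantity being estimated.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nz = pxy > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

x = np.arange(16.0)
mi_identical = mutual_information(x, x)        # fully dependent profiles
mi_unrelated = mutual_information(x, x % 4)    # uniform over all bin pairs
```

With few time points per experiment, a histogram estimate is sensitive to the choice of bins, which is one reason smoother estimators such as b-splines or kernel densities are preferred in practice.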
Context Likelihood of Relatedness (CLR)
Having computed a DTI value, the significance of the value needs to be determined. That is, the probability that the DTI indicates a true dependency is to be assessed. The complementary event, the null-hypothesis, is represented by a DTI value that can be obtained from the expression series of randomly chosen non-interacting genes. The null-distribution of the DTI values for a given context, i.e., the distribution of DTI values for two independent genes, is not known. Hence it needs to be estimated.
A method to perform this estimation is represented by the context likelihood of relatedness (CLR) algorithm . CLR is an extension of the relevance networks approach  and has first been proposed by  for cluster-analysis. This approach makes explicit use of the data to estimate the null-distribution. The assumption underlying the approach is that there is no interaction between most gene pairs. Hence, the null-distribution of the DTIs can be obtained from the whole set of DTIs determined from a potential regulator to all other genes.
Furthermore, when using CLR, we do not consider only the value of the DTI within the set of potential regulators of a target gene. Thus, two z-scores are computed. The first is the z-score of the DTI within the null-distribution of DTIs for all potential targets of a regulator and the second the z-score of the DTI within the null-distribution of all potential regulators of a target gene. A cumulative z-score is computed as the quadratic mean of both z-scores. For TF-gene interactions, these z-scores are computed only within the matrix of TF-gene interactions.  in contrast computed z-scores within the full MI matrix and then extracted all those concerning interactions where one partner is a TF. More details on the implementation of the CLR algorithm are given in Additional File 1: Supplemental Material S2.
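The two z-scores and their quadratic mean can be sketched as follows. This is a minimal illustration, not the DTInfer implementation: the matrix layout (rows are TFs, columns are target genes), the function name and the clipping of negative z-scores to zero (as done in the original CLR algorithm) are assumptions.

```python
import numpy as np

def clr_scores(dti):
    """Combine row-wise and column-wise z-scores of a TF x gene DTI matrix."""
    dti = np.asarray(dti, dtype=float)

    def zscore(values, axis):
        mean = values.mean(axis=axis, keepdims=True)
        std = values.std(axis=axis, keepdims=True)
        std = np.where(std > 0, std, 1.0)          # avoid division by zero
        # Assumption: negative z-scores are clipped to 0, as in CLR.
        return np.maximum((values - mean) / std, 0.0)

    z_regulator = zscore(dti, axis=1)  # among all potential targets of a TF
    z_target = zscore(dti, axis=0)     # among all potential regulators of a gene
    # Quadratic mean of both z-scores.
    return np.sqrt((z_regulator ** 2 + z_target ** 2) / 2.0)

# Hypothetical DTI values for 3 TFs (rows) and 3 genes (columns).
example = [[0.1, 0.2, 0.9],
           [0.2, 0.1, 0.1],
           [0.1, 0.2, 0.2]]
scores = clr_scores(example)
```

An entry scores highly only if it stands out both among the targets of its regulator and among the regulators of its target, which suppresses genes that have high DTI values with many partners.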
An interaction is accepted if the cumulative z-score is above a certain threshold. Similar to  this threshold is determined using precision, which is defined as the fraction of known interactions within the set of inferred interactions. However, since not all TFs and genes are equally well studied, we use only a subsystem of the inferred network that contains genes having known regulators or TFs having known targets as a reference. Thus, for the computation of precision, edges corresponding to TFs or genes without known targets or regulators, respectively, are removed. Then, we determine the known interactions by a comparison of inferred TF - gene interactions to the interactions contained within RegulonDB. Finally, we compute precision as the number of known interactions divided by the number of inferred interactions in the reduced graph.
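The precision computation on the reduced graph can be sketched as follows. The function name and the toy identifiers (tfA, g1, ...) are hypothetical; the sketch reads the text as keeping an inferred edge only if its TF has at least one known target and its gene has at least one known regulator, which is an interpretation rather than a confirmed detail of the implementation.

```python
def precision_on_reference(inferred, known):
    """Precision of inferred (tf, gene) edges, restricted to the studied subsystem.

    inferred, known: sets of (tf, gene) tuples; `known` plays the role of the
    reference interactions (e.g. taken from RegulonDB).
    """
    tfs_with_known_targets = {tf for tf, _ in known}
    genes_with_known_regulators = {gene for _, gene in known}

    # Remove edges whose TF has no known target or whose gene has no known regulator.
    reduced = {
        (tf, gene)
        for tf, gene in inferred
        if tf in tfs_with_known_targets and gene in genes_with_known_regulators
    }
    if not reduced:
        return None  # precision is undefined without any remaining edge
    return len(reduced & known) / len(reduced)

# Hypothetical toy network.
inferred = {("tfA", "g1"), ("tfA", "g2"), ("tfB", "g3"), ("tfC", "g1")}
known = {("tfA", "g1"), ("tfB", "g2")}
precision = precision_on_reference(inferred, known)
```

In this toy case the edges (tfB, g3) and (tfC, g1) are dropped before counting, so poorly studied TFs and genes do not lower the estimate.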
Sequence-based validation of TF - gene interactions
Inferred interactions are validated through independent evidence. This process helps to reduce the number of detected interactions to a smaller set containing a higher fraction of true interactions. A direct approach is to search for putative binding sites of the regulator in the promoter region of the target gene. This process is separated into two steps. First, a putative binding site of the TF is searched in the promoter region of the target gene. Then, this binding site is checked for phylogenetical conservation.
The discovery of binding sites can be performed using various approaches [11, 25, 26]. Here, the R-package cosmo was used. cosmo allows us to detect overrepresented motifs in DNA sequences. Binding sites are detected by passing known binding sites of the TF along with a stretch of 400 base pairs upstream of the start site of the presumed target gene to cosmo (for more details on the detection of binding sites see Additional File 1: Supplemental Material S4).
In order to validate the predicted binding site its phylogenetical conservation over different species is checked. Phylogenetically conserved regions upstream of genes in ten proteobacterial genomes have been identified in [27, 28]. The assumption that underlies this analysis is that TFBSs are under a positive selective pressure and hence can be identified by comparing stretches of upstream regions of orthologous genes in several species. If a conserved region overlaps to more than 50% with a putative binding site, the interaction is accepted.
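The acceptance rule can be sketched as an interval overlap test. The coordinate convention (half-open intervals in base pairs), the function names and the choice to measure the overlap relative to the length of the binding site are assumptions made for this illustration.

```python
def overlap_fraction(site, conserved):
    """Fraction of the binding site covered by one conserved region.

    Both arguments are half-open (start, end) intervals on the same strand
    and coordinate system.
    """
    site_start, site_end = site
    cons_start, cons_end = conserved
    shared = max(0, min(site_end, cons_end) - max(site_start, cons_start))
    return shared / (site_end - site_start)

def is_accepted(site, conserved_regions, min_fraction=0.5):
    """Accept if any conserved region covers more than min_fraction of the site."""
    return any(overlap_fraction(site, region) > min_fraction
               for region in conserved_regions)

# Hypothetical coordinates: a 20 bp putative binding site.
site = (100, 120)
accepted = is_accepted(site, [(105, 150)])        # 15 of 20 bp covered
rejected = is_accepted(site, [(110, 150)])        # exactly 10 of 20 bp covered
```

Because the criterion is "more than 50%", a site that is covered exactly halfway is not accepted in this sketch.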
List of experiments
Time points (min)
ccdB overexpression, o-phenanthroline chelator, recA knockout, different E. coli strains
0, 30, 60, 90, 120
0, 30, 60, 90, 120
0, 30, 60, 90
ccdB-chelator W1872*
0, 30, 60, 120
ccdB-chelator MG1063*
0, 30, 60, 120
0, 30, 60, 120, 180
0, 30, 60, 90
0, 30, 60, 120, 180
lacZ up-regulation after induction, different E. coli strains
0, 30, 60, 90, 120
0, 30, 60, 90
0, 30, 60, 90
0, 30, 60, 90
norfloxacin, recA knockout, different E. coli strains
0, 30, 60, 120
0, 30, 60, 120
0, 30, 60, 120, 180
0, 30, 60, 120, 180
0, 30, 60, 120, 180
0, 30, 60, 120, 180
0, 30, 60, 90
The pdhR gene was amplified by standard PCR with a pair of primers, PdhR+ (5'-CTGCAGGAACTCATGGCCTACAG-3') and PdhRhis- (5'-GAATTCCTAGTGGTGGTGGTGGTGATTCTTTCGTTGCTCCAG-3'). The latter encodes a C-terminal Penta-His-tag fused to the pdhR gene. Genomic DNA of the Escherichia coli K-12 derivative LJ110 was used as template. The 797 bp PCR product was purified with the DNA Purification System (Promega), ligated into the pGEM®-T vector (Promega) and sequenced (Scientific Research and Development GmbH). Via a 5' PstI restriction site provided by the primer PdhR+ and a 3' PstI restriction site provided by the pGEM®-T vector, the pdhR-his gene was cloned into the expression plasmid pTM30, yielding pTM30PdhRhis.
Purification of His-tagged PdhR protein
His-tagged PdhR was overexpressed in E. coli JM109 using the expression plasmid pTM30PdhRhis and purified using affinity chromatography as described previously, except that, for purification, frozen cells were resuspended in lysis buffer (50 mM Tris-HCl, pH 8.0 at 4°C, 100 mM NaCl) with 0.25 mM AEBSF and 0.25 mg/ml lysozyme. Instead of using a Ni(II)-NTA agarose suspension, the supernatant was loaded onto a HisTrap FF column (GE Healthcare) and purified with the ÄKTA FPLC (GE Healthcare). The column was subsequently washed with 10 ml buffer N (20 mM Tris-HCl, pH 8.0 at 4°C, 0.1 mM EDTA, 500 mM NaCl, 5 mM 2-mercaptoethanol, and 5% glycerol) containing 5 mM imidazole, and with 20 ml buffer N containing 20 mM imidazole. The protein was eluted with buffer N containing 150 mM imidazole and the fraction containing His-tagged PdhR was dialyzed against storage buffer (50 mM Tris-HCl, pH 7.6 at 4°C, 200 mM KCl, 10 mM MgCl2, 0.1 mM EDTA, 1 mM DTT, and 50% glycerol). The protein concentration was determined with the Qubit fluorometer (Invitrogen) and the purity was checked by SDS-PAGE, western blot analysis and silver staining.
Gel shift assay
DNA probes were generated by annealing equimolar amounts of fluorescence-labeled primers (Thermo Fisher Scientific) of either the PdhR-binding site (5'DY682-GCCGAAGTCAATTGGTCTTACCAATTTCATGTCTGTG-3' and 5'DY782-CACAGACATGAAATTGGTAAGACCAATTGACTTCGGC-3') or the Mlc-binding site (5'DY782-TTGGCAAATTATTTTACTCTGTGTAATAAATAAAGGGCG-3' and 5'DY682-CGCCCTTTATTTATTACACAGAGTAAAATAATTCAGTGCCAA-3'). The promoter region (507 bp from the initiation codon) of lipA was amplified by PCR with fluorescence-labeled primers (5'DY682-ACTATCGACAACGCTGCGCATG-3' and 5'DY782-TAGCGTGCGTGTTCCAGTTGCG-3'). The PCR product was purified with the DNA Purification System (Promega). The gel shift assays were performed as described previously, except that 0.1 pmol labeled DNA probe was added to the binding reaction. The PCR product of the lipA promoter region was used at 0.025 pmol per reaction. The binding buffer and conditions were the same as in that protocol, but 0.1 mg of heterologous herring sperm DNA per ml of buffer was added. After addition of 5 μl 50% glycerol to the binding reaction, the sample was loaded onto a 6% polyacrylamide gel. After gel electrophoresis the labeled DNA was detected by the Odyssey Scanner (Licor).
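Annealing two oligonucleotides into a double-stranded probe requires that one is the reverse complement of the other. The following sketch checks this for the two PdhR-binding site oligonucleotides listed above, with the dye labels omitted and any whitespace inside the sequences removed; the helper name is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

# PdhR-binding site oligonucleotides from the text, both written 5' -> 3'.
pdhr_fwd = "GCCGAAGTCAATTGGTCTTACCAATTTCATGTCTGTG"
pdhr_rev = "CACAGACATGAAATTGGTAAGACCAATTGACTTCGGC"

probe_is_fully_paired = reverse_complement(pdhr_fwd) == pdhr_rev
```

The same check can be applied to any other oligonucleotide pair before ordering or annealing.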
We thank the Gardner Lab of Boston University and especially Boris Hayete for providing us with their implementation of the spline-based mutual information estimator as well as helpful insights into their work. Furthermore we thank Arvind Rao for helpful comments on the concept of directed information. This study was supported by the German Federal Ministry of Education and Research (BMBF, grants FKZ 0315285A, FKZ 0315285E and FKZ 0315285C) and the University of Jena (initiative "Gene-regulatory networks").
1. Hecker M, Lambeck S, Toepfer S, van Someren E, Guthke R: Gene regulatory network inference: data integration in dynamic models-a review. Biosystems. 2009, 96: 86-103. doi:10.1016/j.biosystems.2008.12.004
2. Butte AJ, Kohane IS: Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. Pac Symp Biocomput. 2000, 418-429.
3. Basso K, Margolin AA, Stolovitzky G, Klein U, Dalla-Favera R, Califano A: Reverse engineering of regulatory networks in human B cells. Nat Genet. 2005, 37 (4): 382-390. doi:10.1038/ng1532
4. Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Favera RD, Califano A: ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics. 2006, 7 (Suppl 1): S7. doi:10.1186/1471-2105-7-S1-S7
5. Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS: Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol. 2007, 5: e8. doi:10.1371/journal.pbio.0050008
6. Meyer PE, Kontos K, Lafitte F, Bontempi G: Information-theoretic inference of large transcriptional regulatory networks. EURASIP J Bioinform Syst Biol. 2007, 79879.
7. Massey J: Causality, feedback and directed information. 1990
8. Rao A, Hero AO, States DJ, Engel JD: Using directed information to build biologically relevant influence networks. Comput Syst Bioinformatics Conf. 2007, 6: 145-156.
9. Faith JJ, Driscoll ME, Fusaro VA, Cosgrove EJ, Hayete B, Juhn FS, Schneider SJ, Gardner TS: Many Microbe Microarrays Database: uniformly normalized Affymetrix compendia with structured experimental metadata. Nucleic Acids Res. 2008, 36 (Database issue): D866-D870.
10. Gama-Castro S, Jiménez-Jacinto V, Peralta-Gil M, Santos-Zavaleta A, Peñaloza-Spinola MI, Contreras-Moreira B, Segura-Salazar J, Muñiz-Rascado L, Martínez-Flores I, Salgado H, Bonavides-Martínez C, Abreu-Goodger C, Rodríguez-Penagos C, Miranda-Ríos J, Morett E, Merino E, Huerta AM, Treviño-Quintanilla L, Collado-Vides J: RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation. Nucleic Acids Res. 2008, 36 (Database issue): D120-D124.
11. Bembom O, Keles S, van der Laan MJ: Supervised detection of conserved motifs in DNA sequences with cosmo. Stat Appl Genet Mol Biol. 2007, 6: Article 8.
12. Henestrosa ARFD, Ogi T, Aoyagi S, Chafin D, Hayes JJ, Ohmori H, Woodgate R: Identification of additional genes belonging to the LexA regulon in Escherichia coli. Mol Microbiol. 2000, 35 (6): 1560-1572. doi:10.1046/j.1365-2958.2000.01826.x
13. McKenzie GJ, Magner DB, Lee PL, Rosenberg SM: The dinB operon and spontaneous mutation in Escherichia coli. J Bacteriol. 2003, 185 (13): 3972-3977. doi:10.1128/JB.185.13.3972-3977.2003
14. Cho BK, Knight EM, Barrett CL, Palsson BØ: Genome-wide analysis of Fis binding in Escherichia coli indicates a causative role for A-/AT-tracts. Genome Res. 2008, 18 (6): 900-910. doi:10.1101/gr.070276.107
15. Sedgwick SG, Goodwin PA: Interspecies regulation of the SOS response by the E. coli lexA+ gene. Mutat Res. 1985, 145 (3): 103-106.
16. Wade JT, Reppas NB, Church GM, Struhl K: Genomic analysis of LexA binding reveals the permissive nature of the Escherichia coli genome and identifies unconventional target sites. Genes Dev. 2005, 19 (21): 2619-2630. doi:10.1101/gad.1355605
17. Ogasawara H, Ishida Y, Yamada K, Yamamoto K, Ishihama A: PdhR (pyruvate dehydrogenase complex regulator) controls the respiratory electron transport system in Escherichia coli. J Bacteriol. 2007, 189 (15): 5534-5541. doi:10.1128/JB.00229-07
18. Herbert AA, Guest JR: Lipoic acid content of Escherichia coli and other microorganisms. Arch Microbiol. 1975, 106 (3): 259-266. doi:10.1007/BF00446532
19. Kredich NM: The molecular basis for positive regulation of cys promoters in Salmonella typhimurium and Escherichia coli. Mol Microbiol. 1992, 6 (19): 2747-2753. doi:10.1111/j.1365-2958.1992.tb01453.x
20. Murray EL, Conway T: Multiple regulators control expression of the Entner-Doudoroff aldolase (Eda) of Escherichia coli. J Bacteriol. 2005, 187 (3): 991-1000. doi:10.1128/JB.187.3.991-1000.2005
21. Schneider R, Lurz R, Lüder G, Tolksdorf C, Travers A, Muskhelishvili G: An architectural role of the Escherichia coli chromatin protein FIS in organising DNA. Nucleic Acids Res. 2001, 29 (24): 5107-5114. doi:10.1093/nar/29.24.5107
22. Bradley MD, Beach MB, de Koning APJ, Pratt TS, Osuna R: Effects of Fis on Escherichia coli gene expression during different growth stages. Microbiology. 2007, 153 (Pt 9): 2922-2940. doi:10.1099/mic.0.2007/008565-0
23. Moon YI, Rajagopalan B, Lall U: Estimation of mutual information using kernel density estimators. Physical Review E. 1995, 52: 2318-2321. doi:10.1103/PhysRevE.52.2318
24. Daub CO, Steuer R, Selbig J, Kloska S: Estimating mutual information using B-spline functions-an improved similarity measure for analysing gene expression data. BMC Bioinformatics. 2004, 5: 118. doi:10.1186/1471-2105-5-118
25. Stormo GD: DNA binding sites: representation and discovery. Bioinformatics. 2000, 16: 16-23. doi:10.1093/bioinformatics/16.1.16
26. van Helden J: Regulatory sequence analysis tools. Nucleic Acids Res. 2003, 31 (13): 3593-3596. doi:10.1093/nar/gkg567
27. McCue L, Thompson W, Carmack C, Ryan MP, Liu JS, Derbyshire V, Lawrence CE: Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes. Nucleic Acids Res. 2001, 29 (3): 774-782. doi:10.1093/nar/29.3.774
28. McCue LA, Thompson W, Carmack CS, Lawrence CE: Factors influencing the identification of transcription factor binding sites by cross-species comparison. Genome Res. 2002, 12 (10): 1523-1532. doi:10.1101/gr.323602
29. Forsythe GE, Malcolm MA, Moler CB: Computer Methods for Mathematical Computations. 1977, Prentice-Hall series in automatic computation
30. Keseler IM, Collado-Vides J, Gama-Castro S, Ingraham J, Paley S, Paulsen IT, Peralta-Gil M, Karp PD: EcoCyc: a comprehensive database resource for Escherichia coli. Nucleic Acids Res. 2005, 33 (Database issue): D334-D337.
31. Zeppenfeld T, Larisch C, Lengeler JW, Jahreis K: Glucose transporter mutants of Escherichia coli K-12 with changes in substrate recognition of IICB(Glc) and induction behavior of the ptsG gene. J Bacteriol. 2000, 182 (16): 4443-4452. doi:10.1128/JB.182.16.4443-4452.2000
32. Morrison TB, Parkinson JS: Liberation of an interaction domain from the phosphotransfer region of CheA, a signaling kinase of Escherichia coli. Proc Natl Acad Sci USA. 1994, 91 (12): 5485-5489. doi:10.1073/pnas.91.12.5485
33. Yanisch-Perron C, Vieira J, Messing J: Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene. 1985, 33: 103-119. doi:10.1016/0378-1119(85)90120-9
34. Yamamoto K, Ogasawara H, Fujita N, Utsumi R, Ishihama A: Novel mode of transcription regulation of divergently overlapping promoters by PhoP, the regulator of two-component system sensing external magnesium availability. Mol Microbiol. 2002, 45 (2): 423-438. doi:10.1046/j.1365-2958.2002.03017.x
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.