Exploring the metabolic network of the epidemic pathogen Burkholderia cenocepacia J2315 via genome-scale reconstruction

Background Burkholderia cenocepacia is a threatening nosocomial epidemic pathogen in patients with cystic fibrosis (CF) or a compromised immune system. Its high level of antibiotic resistance is an increasing concern in treatments against its infection. Strain B. cenocepacia J2315 is the most infectious isolate from CF patients. There is a strong demand to reconstruct a genome-scale metabolic network of B. cenocepacia J2315 to systematically analyze its metabolic capabilities and its virulence traits, and to search for potential clinical therapy targets. Results We reconstructed the genome-scale metabolic network of B. cenocepacia J2315. An iterative reconstruction process led to the establishment of a robust model, iKF1028, which accounts for 1,028 genes, 859 internal reactions, and 834 metabolites. The model iKF1028 captures important metabolic capabilities of B. cenocepacia J2315 with a particular focus on the biosyntheses of key metabolic virulence factors to assist in understanding the mechanism of disease infection and identifying potential drug targets. The model was tested through BIOLOG assays. Based on the model, the genome annotation of B. cenocepacia J2315 was refined and 24 genes were properly re-annotated. Gene and enzyme essentiality were analyzed to provide further insights into the genome function and architecture. A total of 45 essential enzymes were identified as potential therapeutic targets. Conclusions As the first genome-scale metabolic network of B. cenocepacia J2315, iKF1028 allows a systematic study of the metabolic properties of B. cenocepacia and its key metabolic virulence factors affecting the CF community. The model can be used as a discovery tool to design novel drugs against diseases caused by this notorious pathogen.


Background
Burkholderia cenocepacia is a Gram-negative opportunistic pathogen and formerly Genomovar III of Burkholderia cepacia complex (Bcc). The Bcc comprises at least 17 taxonomically related species [1][2][3], which have developed diverse niches from the natural environment [4] and humans as they have emerged as pathogens in patients with cystic fibrosis (CF), chronic granulomatous disease, and in immunocompromised individuals [5]. B. cenocepacia is the dominant Bcc species in patients with CF, accounting for between 50% and 80% of the infection cases [5]. It also causes many instances of non-CF clinical infections, such as for cancer patients [6,7]. As a representative isolate for the spread of an epidemic CF strain, B. cenocepacia J2315 belongs to a clonal lineage known as ET12, which is of increased transmissibility and dominates fatal infections among CF patients in the United Kingdom and Canada [8][9][10][11][12]. B. cenocepacia J2315 is notorious for its high resistance to the majority of clinically useful antimicrobial agents [6,13], including antimicrobial peptides [14,15]. Yet the mechanisms of host infection and drug resistance remain mostly unknown.
The genome of B. cenocepacia J2315 has been sequenced and recently annotated [13]. It is one of the largest Gram-negative genomes consisting of three circular chromosomes with 3.8, 3.2 and 0.8 million base pairs (Mb) respectively and a plasmid. Its complex genome encodes a broad range of metabolic capabilities, and numerous virulence and drug resistance functions that allow it to survive under a variety of conditions and invade immunocompromised individuals. It is vital to develop a systems-level metabolic model for this opportunistic human pathogen to explore and gain insights into its versatile metabolic capability and disease-causing mechanism, and eventually aid in finding potential clinical therapeutic targets. The genome-scale metabolic reconstruction enables integration of genomic information with metabolic activities observed in phenotypic experiments and other "omics" measurements to elicit hidden biological knowledge that would have been otherwise difficult to obtain.
In this study, we presented the manually curated genome-scale metabolic network of B. cenocepacia J2315, named as iKF1028, which accounts for the major metabolic pathways for the synthesis of each component of biomass and for the degradation of common biologically important carbon sources. Syntheses pathways for key virulence factors highly associated with metabolism were particularly emphasized and reconstructed. The in silico model was validated by performing BIOLOG substrate utilization assays, which can test the ability of a microorganism to oxidize various substrates simultaneously [16]. Model-driven analysis and discoveries, including refinement of gene annotation, and gene and enzyme essentiality, were carried out to define the architecture of the genome-wide metabolic and transport network and assist the identification of potential drug targets. Model iKF1028 provides researchers a framework to explore and understand the global metabolism of B. cenocepacia J2315 and its key metabolic virulence factors affecting CF patients upon infection. It allows a broad spectrum of basic and practical applications, especially the application for drug design which may open new doors for anti-infection strategies.

Results and discussion
Characteristics of the genome-scale metabolic network of B. cenocepacia J2315 The genome-scale reconstructed metabolic model of B. cenocepacia J2315, referred by the conventional naming rules [17] as iKF1028, consists of 859 internal reactions (including transport) and 834 metabolites. The reconstruction accounts for 1,028 genes, covering 14.4% of the 7,116 protein coding genes identified from whole genome sequencing (http://www.ncbi.nlm.nih.gov/genome?term=burkholderia%20cenocepacia%20J2315). The model iKF1028 includes all major pathways required for cell growth and the degradation of common biologically important carbon sources of B. cenocepacia. Apart from these central metabolic pathways, model iKF1028 also includes pathways associated with key metabolic virulence factors, which provides insights into how the system-level metabolic properties affect pathogenicity. For an overview, the properties of the J2315 genome and the reconstructed model iKF1028 were summarized in Table 1. Genome-scale metabolic models have been successfully used to study many pathogenic bacteria, including Staphylococcus aureus [18][19][20], Acinetobacter baumannii [21], Mycobacterium tuberculosis [22], Salmonella typhimurium [23], and Pseudomonas aeruginosa [24]. A basic comparison between the model iKF1028 and the above five recently published metabolic reconstructions is also illustrated in Table 1. Schematic representation of the metabolic network of B. cenocepacia J2315 with key metabolic virulence factors is shown in Figure 1. Figure 2 enumerates the metabolic pathways included in iKF1028 and, for each pathway, the number of reactions with assigned and non-assigned genes. The high ratio of gene-associated reactions shows that the reconstructed metabolic model of B. cenocepacia J2315 is reliable. [Additional file 1 for iKF1028 and Additional file 2 is in SBML format]

Metabolic virulence factors in model iKF1028
The success of B. cenocepacia as a pathogen originates from the ability of its large genome to encode numerous virulence mechanisms [13], including quorum sensing (QS) [25][26][27][28][29][30], siderophores-based iron uptake systems [31][32][33], cable pili and adhesion [34][35][36], motility [37,38], hemolysin [39], ZmpA and ZmpB proteases [40][41][42], phospholipases [43], secretion systems [44][45][46], lipopolysaccharides (LPS)  [15,[47][48][49], and extracellular capsule [50]. Syntheses of the key metabolic virulence factors of these virulent mechanisms, namely QS, LPS and rhamnolipids, were incorporated and analyzed in iKF1028. Table 2 lists the virulence-associated pathways and the required proteins and precursors for syntheses of virulence factors in each pathway. The LPS produced by B. cenocepacia J2315 has an important role in both disease aetiology and antibiotic resistance [51,52]. LPS usually consists of three components: lipid A, core oligosaccharide, and O antigen. Although there were some studies on characterizing the features of LPS in B. cenocepacia, all these studies focused on a certain part/component of LPS. So far, there is no systematic elucidation of the LPS structure and composition specifically for B. cenocepacia strain J2315, nor any global analysis on its biosynthesis process of the LPS. In this study, we depicted the detailed features of the complete LPS structure in B. cenocepacia J2315 by integrating all available reports on LPS. We also reconstructed the LPS-synthesis pathways supplemented with all necessary proteins involved and major metabolic precursors, as illustrated in Figure 3. According to our study, in J2315, each of the three components has a very unique feature. The lipid A portion is modified by an additional Ara4N residue [49,53], which had been shown to reduce the binding of cationic antibiotics and was proposed as a potential drug target [54]. The inner core oligosaccharide contains an unusual KDO-KO-Ara4N residue instead of the typical KDO-KDO residue [15,51,55]. The outer core comprises various polysaccharides including L-glycero-D-mannoheptose, glucose, galactose, quinovosamine, and rhamnose [51]. The O-antigen portion of LPS in J2315 was interrupted by an insertion element in BCAL3125 [47,56]. These differences might indicate the reason why strain J2315 is of remarkably distinct activity.
B. cenocepacia strains possess multiple quorum sensing systems, which regulate the expression of versatile  virulence determinants, such as biofilm formation and motility. Strain J2315 owns the ability to synthesize and recognize three types of chemical signals used for cellto-cell communication: N-acylhomoserine lactones (AHLs), 4-quinolones (4Qs), and the DSF-like molecule cis-2-dodecenoic acid (BDSF). Two AHLs-based QS systems have been found in J2315, namely CciIR and CepIR [25,57], which can both produce N-hexanoyl-Lhomoserine lactone (C6-HSL) and N-octanoyl-L-homoserine lactone (C8-HSL) signals using acyl side chain (Hexanoyl-ACP and Octanoyl-ACP, respectively) and Sadenosyl-methionine (SAM) as precursors [58]. The CepIR system is conserved in all species of the Bcc. The CciIR system is encoded within a pathogenicity island, designated as the B. cenocepacia island (cci), which was the first time that cell-signalling genes were found on a genomic island [59]. The 4Qs-based signal, the 2-heptyl-4-quinolone (HHQ), is produced by B. cenocepacia strains [26]. HHQ is the precursor of 2-heptyl-3hydroxy-4-quinolone (PQS) [60] and its synthesis requires four proteins: PqsA, PqsB, PqsC, and PqsD. It had been reported that the exported HHQ from B. cenocepacia can be recognized by Pseudomonas aeruginosa within which HHQ is converted into PQS which is one of the QS signals for P. aeruginosa [26], highlighting the possibility of inter-species communication during the CF co-infection caused by P. aeruginosa and B. cenocepacia. BDSF is a newly discovered signal molecule produced by B. cenocepacia [28]. The synthesis of BDSF requires the gene BCAM0581 [29].
The synthesis pathway of rhamnolipids was also reconstructed in iKF1028. Although there has not been any report demonstrating that B. cenocepacia can produce rhamnolipid, Dubeau et al demonstrated that Burkholderia thailandensis has the orthologs of rhlA, rhlB, and rhlC, which are responsible for the biosynthesis of rhamnolipids in P. aeruginosa [61]. By protein similarity search against the UniProt database, proteins coded by genes BCAM2340, BCAM2338, and BCAM2336 in B. cenocepacia J2315 were identified as highly similar in sequence to rhlA, rhlB, and rhlC in both B. thailandensis (with BLAST E value of 1E-121, 1E-173, and 1E-108, respectively) and P. aeruginosa (with BLAST E value of 3E-60, 7E-98, and 1E-67, respectively). This facilitates us to hypothesize that B. cenocepacia can potentially generate rhamnolipids. Further experimental investigations are needed.
Model validation and gap-filling using phenotype data BIOLOG substrates utilization assays for B. cenocepacia J2315 were performed in triplicates in order to validate and refine the model. In silico growth on various  substrates was simulated by setting each of them as sole carbon source and its uptake rate to 10 mmol/g cell /h under aerobic conditions based on M9 minimal medium. The simulation was performed on the ToBiN platform by flux balance analysis, as described in Methods.
Of the 95 carbon sources tested, 40 could be directly compared with the in silico model of B. cenocepacia J2315, iKF1028. Preliminary disagreement between BIO-LOG assays and in silico predictions were probably due to metabolic gaps, improper gene annotations and unacquainted transporters. These discrepancies were checked through gap analysis and literature mining.
After continuous gap-filling and network refinement, the overall prediction accuracy was improved to 87.5%, a value that supported iKF1028 as being a proper reconstruction of the B. cenocepacia J2315 core metabolism (comparison results are showed in Table 3).
Of the remaining 55 carbon sources tested, 14 were indirectly compared with the model due to the missing knowledge of whether the transport mechanisms of these compounds exist in J2315 or not. Initially, all those 14 carbon sources showed a no-growth phenotype both in silico and in the BIOLOG assays. Then we made the assumption that each of these carbon sources could be transported into the cell (i.e. to function as intracellular compounds) and re-tested whether the in silico model can grow on each of them. The results showed that 11 of the 14 carbon sources enabled iKF1028 to grow after applying the above assumption. This supports the hypothesis that J2315 lacks of transporters for all those 11 carbon sources, even though their catabolic pathways are complete. For the rest 3 carbon sources, the agreement between the in silico results and BIOLOG assays remained.
As the catabolism of the remaining 41 carbon sources out of 55 has not been well studied and information regarding their role in the cell could not be found in public resources, these 41 carbon sources cannot be analyzed in our model. (Complete comparison results with BIO-LOG assays are supplied in the Additional file 3).

Model-driven refinement of genome annotation
The reconstruction of metabolic network allowed the identification and refinement of improperly annotated genes of B. cenocepacia J2315 from the public biological databases. Careful effort was made in this work to rectify the current genome annotation based on metabolic gap analysis, BLAST searches, BIOLOG substrate utilization assays, and literature mining. The full list of refinement of genome annotation derived from the genome-scale metabolic reconstruction is shown in Table 4.
The first type of refinement was to re-annotate genes in iKF1028 -based on literature evidence and BLAST searches -that have been improperly annotated. An example is the gene BCAL1281, which was annotated in both the Burkholderia Genome Database and KEGG as a "hypothetical protein", but that was now reassigned coding for an "ornithine N-acyltransferase". It was reported that the outer membrane of "B. cepacia" [62] possesses unusual polar lipids, ornithine amide lipids (OL) [63,64]. In addition, the protein OlsB is required for the first step of OL biosynthesis [65]. By BLAST searches of OlsB against the UNIPROT database, the gene BCAL1281 of B. cenocepacia J2315 was identified with high similarity.
Another type of annotation refinement was based on gap analyses, which pinpointed reactions for which the gene products involved were missing. For instance, we identified a missing reaction that should be catalyzed by IspE and that takes part in the biosynthesis of polyprenyl-PP, a necessary precursor of the ubiquinone biosynthesis. The IspE encoding genes for other strains of B. cenocepacia (AU1054, HI2424, MC0-3) could be identified in the Burkholderia Genome Database and KEGG. By querying GeneDB, we found that the genomic location from 872938 to 873820, named BCAL0802, was not assigned any function in the above two databases. A BLAST search of BCAL0802 against the UNI-PROT database revealed a perfect match with IspE from other B. cenocepacia strains. Consequently, BCAL0802 is annotated as a 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase and this missing reaction was supplemented into the model iKF1028. This example exemplifies how the reconstruction process can drive the reconciliation of isolated data from different biological databases.
BIOLOG substrate utilization assays have already been successfully used on the refinement of metabolic reconstructions [24]. It is an efficient approach for gap analysis and refinement of genome annotation. For example, in our study we can highlight the case for the substrate D-galactose, associated in BIOLOG assays with a growth phenotype. From that result we inferred that J2315 should contain a transport mechanism for D-galactose. By homology searches of MglA, MglB, and MglC, which are galactose-binding proteins conveying galactose into the cell, gene BCAL1431 (mglB), BCAL1432 (mglA), and BCAL1433 (mglC) were identified and they had been annotated as a putative sugar ABC transporter ATP-binding protein, a putative ribose ABC transport system, and a putative sugar transport system permease protein respectively. The ordering of mglB, mglA, and mglC is consistent with a) previous studies indicating that the mgl operon contains three genes and the genes are transcribed in the order of mglB, mglA, and mglC [66] and b) mglA and mglC being located downstream of mglB [67]. As a result, the annotations of BCAL1431-1433 were refined to account for the galactose transport. In total, 7 genes were reannotated based on BIOLOG assays.

Gene essentiality analysis
The term 'essential gene' means a gene for which knockout is lethal (i.e. no biomass yield) under certain conditions (e.g. glucose minimal medium) [68]. Identification of essential genes is helpful to understand the basic functions required for survival and it is an efficient way to discover novel targets for new antimicrobial therapies. Here in this study, iKF1028 was used as a framework to predict computationally identified essential genes in B. cenocepacia J2315 on both M9 minimal medium and synthetic CF medium (SCFM). About 19% (192) and 15% (154) of the 1,028 metabolic genes included in iKF1028 were predicted to be essential on M9 and SCFM media, respectively. There are more genes predicted as essentials on M9 than on SCFM, which indicates the influence of the living environment on the bacterium. 147 overlapping predictions were on both media. These essential genes are unequally located on the three chromosomes and most of the essential genes are located on chromosome 1 (Figure 4a). This result agrees with known features of the J2315 genome: chromosome 1 contains a higher proportion of coding sequence (CDS) involved in central metabolism and other house-keeping functions, whereas chromosomes 2 and 3 contain a greater proportion of CDS encoding accessory functions [13]. To assess the predictive potential of the model, we compared the in silico essential genes predicted on SCFM with experimental essentiality data for P. aeruginosa PAO1 and P. aeruginosa PA14 [69,70] since there is no experimental gene essentiality data available for B. cenocepacia. As both P. aeruginosa and B. cenocepacia are CF pathogens, and B. cenocepacia was historically classified under the genus Pseudomonas [71], it is possible that partial similarity exists between them. SCFM was chosen due to its similarity to the nutritional composition of sputum from CF patients. The common set of the essential genes from two P. aeruginosa strains was chosen for comparison in order to reduce the effect of strain-dependent variation. Totally, there are 294 in silico essential metabolic and non-metabolic genes of B. cenocepacia J2315 with high similarity to the common set of P. aeruginosa by BLAST searches, of which 91 in vivo essential genes are present in iKF1028. Genes in the in vivo essential set but not in the iKF1028 were assumed to be involved either in non-metabolic functions or in accessory functions of metabolism. A total of 55% (50) of the in silico predicted essential genes agreed with the in vivo essential genes based on gene homology with P. aeruginosa (Figure 4b, refer to Additional file 4 for detailed information about essential genes).
Based on gene essentiality analysis, the genome-scale metabolic model of B. cenocepacia was further refined. For example, BCAL0660 and BCAL3421, which were homologous genes encoding protein AccC according to their annotation, were originally both included in the model with the gene-protein-reaction (GPR) relationship of "BCAL0660 or BCAL3421". Through in silico gene deletion study, BCAL3421 is identified as non-essential, which is inconsistent with the in vivo essential gene results. Such discrepancy was subject to further analysis. AccA, AccB, AccC, and AccD were four subunits of Acetyl-CoA Carboxylase (ACC) catalyzing the first step of fatty acid biosynthesis [72]. The gene accB, of which the locus in J2315 is BCAL3420, is frequently adjacent to the gene accC and both genes are cotranscribed and form an operon together [73,74]. Furthermore, BCAL3420 and BCAL3421 shows greater than 2-fold expression for J2315 under CF conditions versus soil conditions, yet BCAL0660 shows an opposite result [75].
Taken together, BCAL0660 was excluded from model iKF1028. Further studies are necessary to validate the function of BCAL0660.

Identification of essential enzymes and potential drug targets
Essential enzyme/protein refers to a gene product (catalyzing the relevant reactions) for which individual deletion (i.e. imposing the fluxes through these reactions to zero) is lethal under certain conditions. Through FBA using iKF1028,we could obtain a collection of essential enzymes (protein), based on which 45 essential enzymes were identified as potential drug targets and supported by experimental evidences from literatures. There are 39 of them which were also predicted as drug targets for P. aeruginosa PAO1 [76]. All the 39 targets are nonhomologous to human protein sequences and thus could serve as potential candidate antibiotic drug targets for CF patients infected by both B. cenocepacia J2315 and P. aeruginosa PAO1. Among 39 targets, there are 9 targets, namely AccA, AccB, AccC, AccD, MurA, FolP, PhoA, RibE, and RibH, which have approved drugs in the DrugBank database [77].
The other 6 potential targets, namely ArnT, ArnB, ArnC, ArnA1, ArnA2, and Ugd are unique in B. cenocepacia J2315. ArnT, ArnB, ArnC, ArnA1, and ArnA2 are necessary proteins required for the synthesis of Ara4N, which is an additional moiety of LPS specially presented in B. cenocepacia J2315. Ara4N is essential for the viability of B. cenocepacia J2315 and significantly contributes to high resistance to antimicrobial peptides (AMPs) [54]. AMPs have been proposed as agents for treating CF infections [78,79]. It had also been demonstrated that arnC transposon mutant was survival-defective and attenuated in infected rats [48]. The UDP-glucose dehydrogenase (Ugd), which catalyzes the conversion of UDP-glucose to UDP-glucuronic acid and is the initial step in the synthesis of UDP-Ara4N, is also necessary for the viability of B. cenocepacia and its resistance to polymyxin B [80]. These targets are potentially useful for designing strategies against B. cenocepacia J2315. Further studies are necessary to test their applicability. An overview of the 45 proposed targets is given in Table 5.  annotation refinement and identification of gene and enzyme essentiality analyses are helpful to understand the genome and discover promising novel drug targets. Through careful investigation, we proposed 45 enzymes that catalyze reactions predicted to be essential for growth with priority to be considered as drug targets. The model will keep being further validated and improved with experimentally determined biomass composition, large-scale gene deletion experimental data, proteome, and metabolome data, as they become available for B. cenocepacia. The model herein developed provides a valuable tool to explore the metabolic space of B. cenocepacia, to describe its metabolic wiring under a range of conditions, to pinpoint possible targets and to generate testable hypotheses. Taken together, our study underlined the value of the model iKF1028 as a framework to systematically study the metabolic capabilities of B. cenocepacia and its metabolic virulence factors affecting the CF community.

Reconstruction of the metabolic network
The reconstruction process for B. cenocepacia J2315 is illustrated in Figure 5. The process followed the procedure described previously [81]. The reconstruction was carried out on ToBiN (Toolbox for Biochemical Networks, http://www.lifewizz.com), which was first mentioned in the paper [82]. ToBiN is a modular platform for metabolic modelling and the structural analysis of networks. It consists of a collection of open-source computational tools. Sets of reactions can be uploaded in the platform via a web interface, merged with already existing sets, and the resulting stoichiometric matrix is then processed by the server as a FBA problem. The linear solver that ToBiN used is the Clp (Coin-or linear programming), an open-source linear programming solver written in C++ and is part of the COIN-OR (Computational Infrastructure for Operations Research) project (http://www.coin-or.org). The platform works in a similar way as the COBRA toolbox with the main difference that, by being web-based, it permits users to adopt a more efficient and collaborative workflow. An initial draft reconstruction was derived from the annotated genome of Burkholderia cenocepacia J2315 available at the Burkholderia Genome Database (http:// www.burkholderia.com). To link annotated genes to proteins and proteins to reactions, biological databases such as KEGG, GeneDB, UniProt, BRENDA, Transport Classification Database (TCDB), and TransportDB were used [83][84][85][86][87][88]. Manual curations were performed to establish gene-protein-reaction (GPR) associations, which connect genetic data to reactions in the metabolic network and allow for subsequent exploration of metabolic phenotypes using genetic perturbations.
After the initial reconstruction was generated, gaps in metabolic pathways necessary to produce biomass components and key virulence factors were filled by cautious literature mining, BIOLOG substrates utilization assays, and BLAST searches on homology and protein sequence similarity analyses [89,90]. The genome annotation was refined as consequence of the gap-filling and model extension process.
Flux Balance Analysis (FBA) was carried out throughout this study to explore the metabolic capabilities of iKF1028 under various environments. In addition to minimal medium, synthetic cystic fibrosis medium (SCFM) representing the physical living environment during CF infection was simulated in silico to investigate the metabolic flux distribution in a CF-like condition.

Biomass composition
The biomass composition in the genome-scale metabolic model of B. cenocepacia J2315 was adapted by selecting the well-studied biomass composition of E. coli as a template [91], since there's no experimental data available about the biomass composition of B. cenocepacia. However, the amount of metabolic precursors to formulate the cellular component was specific to B. cenocepacia according to previous study [20]. Moreover, the relative fatty acid composition of the lipids required for growth was based on data specific to B. cenocepacia [63,64,92,93] and listed in Table 6. Further details are provided in the supplemental material [Additional file 5].

Flux balance analysis
Flux balance analysis (FBA) is an algorithm based on linear programming (LP) and on the assumption that the represented metabolic network is in steady-state (i.e. all the intracellular metabolite concentrations are constant). Being a LP problem, FBA also requires the selection of an objective function and of whether the value for that same function should be maximized or minimized. FBA is usually used to compute the optimal growth yield (the maximized objective function) based on the assumption that the evolutionary fitness of the organism depends on growth alone and, consequently, the implicit regulatory mechanism are organized to permit the theoretical maximal growth. If the system of equations (stoichiometric matrix which represents the metabolic network) is feasible, the algorithm generates an optimal flux distribution for that same network, taking into account the imposed thermodynamic constraints (reaction directionality) and limits on substrate uptake rates. The mathematical description is as follows: Where S is a stoichiometric matrix containing i rows representing metabolites and j columns representing reactions, v is a vector of all reaction fluxes, v min and v max are imposed lower and upper bounds on flux v j respectively, and c T is a vector of coefficients for each reaction that is to be maximized.

In silico media composition
Two different living environments were simulated in silico for strain J2315: M9 minimal medium [94], which contains PO 4 3-, SO 4 2-, NH 4 + , H + , Fe 2+ , K + , Mg 2+ , Na + , H 2 O, and thiamine, with glucose or other BIOLOG substrates as sole carbon source; and synthetic CF sputum medium (SCFM) [95] representing the nutrient conditions inside a host-cell during CF infection. Details of the simulated SCFM composition are provided in the supplemental material [Additional file 6].

BIOLOG assay
To validate the model and estimate the metabolic capabilities of strain J2315, BIOLOG assay was performed by using various carbon sources for strain cultivation [16]. The BIOLOG assay was carried out in triplicates using Biolog GN2 MicroPlates (Biolog, Inc.), which can test the ability of a microorganism to oxidize a panel of 95 different carbon sources simultaneously. The procedure for using the MicroPlates was according to the manufacturer's specification. The strain J2315 was obtained from DSMZ GmbH (DSMZ 16553, equivalent to LMG 16656 as which strain J2315 has been deposited in the BCCM/LMG Bacteria Collection). The strain was cultured overnight in CASO agar plate. Then the bacteria were swabbed from the plate surface and suspended in GN/GP inoculating fluid (Biolog, Inc.) and 150 μl of the suspension was transferred to each well of the GN2 MicroPlate. The MicroPlates were incubated at 30°C for 48 hours and were read by a microplate reader at 24 and 48 hours and analyzed with the Biolog Micro-Log3 4.20 software (Biolog, Inc.). A comparison between the BIOLOG results and in silico predictions is provided in the supplemental material. [Additional file 3] Gene and enzyme essentiality FBA can be used to interpret genetic modification, such as gene deletion and enzyme inhibition, and subsequently make comprehensive in silico predictions on gene and enzyme essentiality [96]. To assess the essentiality of a gene, its GPR is checked for a unique relation with the associated reaction(s). If the gene is necessary to the reaction, the reaction flux will be constrained to zero and a solution for the maximal growth yield is searched. The deleted gene is predicated to be essential if, as consequence of that added constraint, the value of the objective function (growth yield) changes to zero. The deletion of every gene accounted in the model was simulated for growth on minimal medium with glucose as sole carbon source, and on SCFM. Similarly, an enzyme is considered essential if, by constraining to Model extension for metabolic virulence factors Figure 5 The process for genome-scale metabolic reconstruction of B. cenocepacia J2315. The left side indicates resources used for reconstruction, and the right side indicates the reconstruction process. Initial reconstruction started from genome annotation and other biological databases. Gap-filling was a continuous step throughout the reconstruction by probing missing reactions in a pathway which causes in silico growth infeasible, and subsequently closing these gaps by referring to the biological databases, extensive literature mining, and comparison with BIOLOG substrate utilization assays [89,90]. This improved model was then extended by adding key metabolic virulence factors for B. cenocepacia from the literature. The process of model development and validation against experimental data was iteratively repeated until the genome-scale metabolic model was robust.