Designing synthetic networks in silico: a generalised evolutionary algorithm approach
 Robert W. Smith†^{1, 2}View ORCID ID profile,
 Bob van Sluijs†^{1} and
 Christian Fleck^{1}Email author
https://doi.org/10.1186/s1291801704999
© The Author(s) 2017
Received: 31 August 2017
Accepted: 13 November 2017
Published: 2 December 2017
Abstract
Background
Evolution has led to the development of biological networks that are shaped by environmental signals. Elucidating, understanding and then reconstructing important network motifs is one of the principal aims of Systems & Synthetic Biology. Consequently, previous research has focused on finding optimal network structures and reaction rates that respond to pulses or produce stable oscillations. In this work we present a generalised in silico evolutionary algorithm that simultaneously finds network structures and reaction rates (genotypes) that can satisfy multiple defined objectives (phenotypes).
Results
The key step to our approach is to translate a schema/binarybased description of biological networks into systems of ordinary differential equations (ODEs). The ODEs can then be solved numerically to provide dynamic information about an evolved networks functionality. Initially we benchmark algorithm performance by finding optimal networks that can recapitulate concentration timeseries data and perform parameter optimisation on oscillatory dynamics of the Repressilator. We go on to show the utility of our algorithm by finding new designs for robust synthetic oscillators, and by performing multiobjective optimisation to find a set of oscillators and feedforward loops that are optimal at balancing different system properties. In sum, our results not only confirm and build on previous observations but we also provide new designs of synthetic oscillators for experimental construction.
Conclusions
In this work we have presented and tested an evolutionary algorithm that can design a biological network to produce desired output. Given that previous designs of synthetic networks have been limited to subregions of network and parameterspace, the use of our evolutionary optimisation algorithm will enable Synthetic Biologists to construct new systems with the potential to display a wider range of complex responses.
Keywords
Background
Through the efforts of Systems and Synthetic Biologists, we have come to understand that responses of large, complex biological networks are mediated by a series of smaller, interconnected modules or motifs [1, 2]. In combination with synthetic implementation of these network motifs, mathematical modelling has aided the design and exploration of system properties [3–5]. A classic example of the ‘forward engineering’ approach is the Repressilator [3]. Elowitz & Leibler constructed the Repressilator motif in E. coli and, using a mathematical model, found that tuning the promoter strength and the protein lifetimes within their plasmid constructs enhanced the likelihood of obtaining oscillations [3]. These initial findings were extended by Tsai et al. who mathematically analysed different Represillatorbased network structures and parameter sets, finding that strong autoregulation of a single Repressilator component enhances the robustness of oscillations [6]. More recently, PotvinTrottier et al. have improved the performance of the Repressilator experimentally by reducing the effects of noise on the system [7]. Similar work has been performed with toggle switches and feedforward loops, providing us with a range of modular networks that can reliably produce different responses [4, 5].
Whilst the forward engineering approach has proven highly successful, the opposite challenge (‘reverse engineering’ a network design from a known desired response) is also of importance. Notably this would allow one to obtain novel network designs that may possess complex functionality. In terms of network design, there are two levels that need to be explored [8, 9]. The first level to be explored is the ‘network space’ where all possible network topologies exist. One method commonly used to optimise the topology of promoter circuits is Mixed Integer NonLinear Programming, a minimisation optimisation routine where parameters can be altered within certain ranges [10–12]. This method has been extended to optimise networks for multiple objectives, resulting in a Pareto front that allows one to observe the tradeoffs between different system constraints [10]. The second level is the ‘parameter space’ that contains the reaction rates for a given network topology. Importantly for synthetic network design, recent focus has been to find parameter sets that are robust to stochastic fluctuations thereby increasing the likelihood of successful experimental implementation [7, 13–15]. However, efficient means of executing and solving the reverse engineering problem have yet to be developed in a generalised manner for the synthetic biology community.
 1.
The ‘design space’ — consisting of both the network and parameter spaces — is efficiently explored to find the systems that are able to generate a desired phenotype.
 2.
One can track the effects of random system perturbations over the course of cellular evolution, much in the same way as observed during laboratory evolution experiments.
 3.
From the resulting evolutionary trajectories, one may be able to understand how simple motifs have naturally evolved into the much larger and complex networks we observe today in biological systems.
Note that the ‘design space’ is highly multidimensional and unbounded implying that a number of different systems can be found that yield the same phenotype. Thus, through the use of EAs, one is left with a number of locally optimal networks to test and validate experimentally. A review of different EAs has shown that they are able to find optimal networks from synthetic datasets [18]. The principles of EAs are in accordance with steps seen in evolution: starting from an initial population, the fitness of the individuals is assessed against some criteria, reproduction (either sexual or asexual) takes place to produce offspring networks that then undergo random mutations before a subsection of the individuals in the population are removed, undergoing the same cycle until some termination criteria are met [19]. Furthermore, these optimisation routines can be extended to incorporate multiple objectives whereby the resulting population of optimal individuals lies on the Pareto front. Along the Pareto front lies the set of ‘nondominated’ solutions, i.e. a solution cannot become improved for one performance criterium without becoming weaker in another [19, 20]. Thus, the Pareto front provides us with a suite of functionally unique systems that can be constructed experimentally with different properties.
Small modular networks, such as the Repressilator and feedforward loops, have proved to be good testcases for EAs [3, 4]. Francois & Hakim analysed the evolution of bistable switches and oscillating motifs using ordinary differential equations (ODEs) and asexual reproduction of networks [21]. They observed that posttranslational reactions are important for the generation of oscillating systems [21]. Using a different mathematical representation of reactions, Paladugu et al. also obtained a range of small networks able to produce desired responses whilst noting the computational difficulties of evolving large networks using such a mathematical framework [22]. Consequently, recent studies have concentrated on networks of limited size or of limited reactions to find motifs that robustly produce oscillations or respond to inputs [23–27]. Notably, it has been observed that robustness of an oscillator is related to cooperativity between transcription factors (as determined by Hill coefficients greater than 1) rather than system complexity [26]. By increasing the cooperativity between components the system becomes more nonlinear, which is an important design feature for oscillating networks [28]. Futhermore, Rodrigo et al. have tracked the evolution of small oscillating networks to propose how circadian clocks have adapted through evolution [29]. Thus, EAs allow us to understand a broad range of system dynamics and their occurrence in nature.
To alleviate the computational load of searching such a large ‘design space’, different methods of model description have been considered. First, Feng et al. used rulebased modelling whereby all the model information is contained in schema (strings and matrices) to evolve protein signalling networks [30, 31]. Second, Chang et al. have translated the schema into formatted ODEs to evolve synthetic oscillators of different size [32]. One advantage of containing system information in schema is that mutation and sexual reproduction (or crossover) of model pairs can be performed easily [19]. Finally, none of the above cases have taken into account multiobjective optimisation. Recently, Boada et al. have incorporated multiobjective EA steps into their parameter optimisation for a type1 incoherent feedforward loop [4, 33]. The resulting Pareto front highlighted how to tune system responses between different desired situations. Consequently, understanding the tradeoffs of certain system properties is an important aspect of systems design.
In this work, we introduce a generalised EA to solve the reverse engineering problem for synthetic biology. The EA combines schema representation, which allows for reproduction between networks, with a mathematicallytractable framework that can obtain optimal networks of any size for any desired response. This brings together the ideas of Chang et al. and Feng et al. with those of Francois & Hakim and Paladugu et al. [21, 22, 30, 32]. Using oscillatory systems and feedforward loops (see Supplementary Information) as test cases, we build on and generalise previous observations to design oscillating systems. Finally, we incorporate multiple objectives (as in Boada et al.) within our network optimisation [33]. To the best of our knowledge such multiobjective optimisation of network topologies (not just reaction rates) has not yet been utilised for biological systems.
Methods
Here, we shall describe how our EA is encoded and provide details as to the tests performed in this study. The most important concept within the EA is to translate binary strings/schema into reaction schemes for simulation by ODEs. This allows the EA to assess and optimise networks based on dynamic behaviours. In the Additional file 1, interested readers can find extra details about our EA implementation and possible extensions. The Python scripts are available at https://gitlab.com/wurssb/Evolutionary_Algorithm_Network_Design.
Network structure
where j∈ [ 1,N]. Thus, R and P are N×M matrices representing the number of reactants and products in a reaction, respectively, and k is a vector of N reaction rates. This is a similar style to that used by [30, 32] and is discussed in [19]. Here, we shall give an overview of how a network is constructed with examples.
Individual genes
Hence, as our example gene contains a (1,1,1) string then we know the mRNA expression is regulated by the transcription factor(s) of other nodes in the network. In our EA implementation, transcription factors can regulate mRNA production through three effective regulatory gates: competitive binding between monomeric transcription factors, cofactor regulation whereby multiple transcription factors are required for expression, or competitive binding of protein complexes. In this final scenario, complexes of activating (or inhibiting) transcription factors compete for a single binding region of the target promoter.
These dictionaries are maintained for the rest of the EA. The same idea is applied for all regions of the gene binary string and allows our EA to translate the evolving strings into functional modules.
Network description
Gene regulation
The EA maps the information contained in the binary genes and the adjacency matrix to a userdefined reaction library (Additional file 1: Figure S1). The reaction library is a list of potential reactions that are sequentially ordered to construct reaction schema (Additional file 1: Tables S2S6). Each of these reactions is given a unique name (Additional file 1: Figure S2). Ultimately, the ordering of preferred reactions influences the resulting dynamics of the system. To clarify this point more directly we present a simple example.
Example #1: gene interactions

Transcription

Translation

ProteinProtein Binding

Gene Activation
that, upon translation in ODEs (see below), would produce different dynamics to Eq. (4). Notably, in Eq. (5), the dimer sequesters the transcription factor rather than helping to produce the mRNA of G _{2}.
Mathematical formulation of reaction networks
is the flux vector as defined in [36]. The initial conditions are given by \(\vec {\text {X}_{0}}\) and can be set in a userdefined manner.
Evolutionary algorithm
Initialisation
To start the EA, an initial population of individual networks needs to be constructed. The user can define the size of these initial networks or leave this to be randomly chosen from a uniform distribution between the allowed minimum and maximum network size. The maximum network size, defined by the number of gene nodes in the network, can be specified by the user. The initial set of networks are then created randomly by constructing adjacency matrices (with at least 1 connection between nodes) and random schema describing each node. One could also decide that all initial networks are the same by prescribing specific adjacency matrices and node schema.
Determining reaction rates
Parameter rates for each reaction within the network can be predefined or are obtained from a userdefined probability distribution P(k _{ j }). If the parameters of the initial networks are known then the parameter can be set to provide this value. In the mutation step of the algorithm, parameter values can be reselected from the initial ‘global’ probability distribution, P(k _{ j }), or from a userdefined local probability distribution that restricts changes in reaction rates to a small region relative to the original value. Additionally, one can define a subset of parameters within the network that do not undergo mutation or, in the event that a new node or connection is created within the network, new parameters are selected from P(k _{ j }).
Selection

Proportional: the probability of selection is related to the relative fitness of the individual as compared to all other individuals within the population. Thus, the fittest networks are not always selected as parents.

Semiproportional: the fittest individual is selected as one parent, and the second parent is selected using a proportional method. This is set as the default option in our EA, as shall be justified in the “Results” section.

Elite: the two fittest individuals are selected for recombination. The use of elitist methods has been shown to increase the speed of convergence towards an acceptable solution [19].
Given the selection of two individuals, these parental networks go on to form child networks via recombination and mutation steps (if a child has a lower fitness than the parental networks then this child is rejected and replaced by a fitter child) that make up the next generation of individuals. Thus, over the generations, the population of networks is consistently increasing in fitness/match to the desired response.
Recombination
Given this distribution, the probability of creating a large child network each generation is lower than the probability of creating small or medium sized networks.
Example #2: recombination
Hence, the child matrix is highly reminiscent of A _{1} with a missing connection between genes 2 and 3.
Mutation
Example #2 continued: mutation
Scoring functions
To evaluate the phenotypic fitness of the individual networks within the EA, we need to compare simulations with a specific phenotype. In the main text we shall discuss the cases of matching concentration profiles, obtaining oscillations using both single and multiobjective criteria and how oscillating mechanisms can be checked for robustness to internal fluctuations. In the Supplementary Information we investigate feedforward loops. The scoring functions, Δ, for all cases are discussed in detail in the Supplementary Information.
In our EA, we are able to optimise networks both for single and multiple objectives. In the case of single objective optimisation, this implies that the characteristics of model simulations and their comparison to the desired criteria can be aggregated into a single value, \(\Delta = \sum _{i=1}^{n} \delta _{i}\), where δ _{ i } are independentlycalculated objective scores. In the case of multiple objectives, the comparison between the desired system function and simulations yields a vector of objective scores, \(\vec {\Delta } = \{\delta _{1},\delta _{2},...,\delta _{n}\}\). So that two parents can be selected for recombination, we need to reduce this vector of scores to a single number for each network that allows us to more easily compare the performance of different systems. For this purpose, we use a ranking system within our EA. Similar methods of multiple objective scoring have been used previously in evolutionary algorithms that are known to perform well and find results that are optimal for multiple criteria (i.e. the Pareto front) [19, 37].
Taking the L _{ n }norm of each column in Δ yields the vector \(\vec {\Delta } = (\\delta _{1\ast }\,\\delta _{2\ast }\,\ldots,\\delta _{N\ast }\)\). By ranking these scores (where 1 is the lowest score for an objective and M is the highest score) one obtains \(\Xi (\vec {\Delta })\), where min\((\Xi (\vec {\Delta }))\) is the optimal network.
with \(\vec {\Xi } = (\\xi _{1\ast }\,\\xi _{2\ast }\,...,\\xi _{N\ast }\)\) representing the L _{ n }norm of each column. The optimal network is then obtained from min\((\vec {\Xi })\). Notably, these two methods result in differing results — using \(\Xi (\vec {\Delta })\) leads to networks that perform well in one of the N objectives being favoured over those that are able to equally balance multiple objectives, obtained from min(\(\vec {\Xi }\)).
Settings to obtain presented results

The initial conditions for each component in each network, \(\vec {\text {X}_{0}}\), are set to 0.01.

Initial networks are either randomly generated or fixed to specific network topologies (e.g. Repressilator or feedforward loop).

Parameters are drawn globally from a uniform distribution$$ k_{j} \sim U\left(k_{j}^{L},k_{j}^{U}\right), $$(15)
where \(k_{j}^{L}\) and \(k_{j}^{U}\) are the lower and upper bounds of the parameter space set in Additional file 1: Table S7, respectively, and;

Local parameter mutations are found by randomly selecting a new parameter value with 10% deviation from the original value.

The probability of specific network mutations are set in Additional file 1: Table S8.

The reaction library available to each network in the EA is discussed in the Supplementary Information.

Networks are ranked and selected using min\((\vec {\Xi })\) using the L _{1}norm.

In the tests for Fig. 6 d and e, the following userdefined parameters have been randomised: selection method, optimisation objective (i.e. data from Fig. 6 ac that model is compared with), maximum number of genes allowed in each network, size of network populations, number of allowed network & parameter mutations, parameter mutation method (global or local), probability of gene addition, deletion or mutation, and probability of added, deleted or mutated network connection. In each of the 1933 successful EA runs, each option for the userdefined parameters (e.g. there are three types of selection method possible: proportional, semiproportional, and elitist) had equal probability of being selected.
Termination
 1.
If the score of the fittest network has not changed for 250 generations, i.e. the optimisation has converged to a solution, and;
 2.
The maximum number of generations (2500) has been evaluated without convergence.
Whilst it is possible that the fittest network obtained after 2500 generations is only suboptimal, studies have shown that elitist EAs have the capability to find the global optimum within a finite number of generations [38].
Programs and solvers
The algorithm is written in Python version 2.7 (Python Software Foundation, www.python.org). Implementation details can be found in the Additional file 1. The source code is available at https://gitlab.com/wurssb. The GMA models are simulated deterministically using ODEINT from the Scipy package or stochastically using the Gillespie algorithm [39]. However, whilst we have provided code to simulate systems using the Gillespie algorithm, note that this greatly increases the computational time of the EA.
Results
Benchmarking algorithm performance
Before applying our EA for the purposes of network design, we wished to determine how algorithm performance depends on the userdefined settings and scoring functions. Subsequently, we performed two tests (see Additional file 1). First, we calculated the duration of computational time taken to simulate a generation of oscillating networks of different sizes for 6000 timepoints. Second, we randomly selected algorithm parameters for 1933 successful EA runs (see “Methods” section). Based on these results we could determine which algorithm settings had the largest influence on the results and are optimal for use.
where Δ _{final} and Δ _{initial} are the scores Δ of the fittest individual in the final and initial generations, respectively. Thus, if the optimal score is much less than the initial network used in optimisation then the convergence ratio will tend to zero, but if the optimisation process does not converge then the ratio will be 1 or greater. By correlating the convergence scores with the presence or value of a particular algorithm property, we found that the selection method (bar 11), had the largest influence on algorithm performance (Fig. 6 d). For example, the 25 EA runs with the highest convergence scores (i.e. the optimisation routine did not function as desired) all employed the proportional selection method (Fig. 6 e). Conversely, the best performing 25 EA runs with the lowest convergence ratio employed a mix of selection criteria with elite selection being slightly favoured. Thus, the convergence ratio is correlated with the presence of proportional selection criteria. Similarly, the rate of parameter mutation (Fig. 6 d, bar 10) and probability of adding network connections (Fig. 1 d, bar 9) also had a strong influence on whether or not the EA converged to an optimal solution.
Parameter optimisation of Repressilators provides targets for directed evolution studies
To determine which parameters have the largest impact on determining whether a system oscillates or not, we plotted the objective score against individual parameter values (Fig. 7 d). This analysis allowed us to determine, based on the steepness of the relationship between parameter value and score, which rates had the largest impact on determining whether a system was more likely to oscillate or not. The results highlight that binding rates between the proteins within the Repressilator are an important factor that determines the presence of oscillations. The importance of posttranslational protein cooperativity on oscillating systems has been noted previously [21, 26]. Given the nonlinear relationship between components within the Repressilator system, it is likely that combinations of parameters have a larger impact on oscillations than singular parameter perturbations. Further evaluation of the top 25 parameter sets showed that pairs of parameters with highest correlation to the score were production and degradation rates (Fig. 7 e and Additional file 1: Table S9). Thus, tuning production and degradation would influence the presence of oscillations as has been observed in the initial Repressilator design and optimisation studies since [3, 14, 32].
Based on our analysis one could envisage that specifically tuning the protein binding affinities would increase the chance of observing oscillations. By only allowing for the targeted mutation of interaction rates between Repressilator components we determined the number of generations required for the EA to find an oscillating parameter set. As observed in Fig. 7 f, the average number of generations from 272 successful EA runs required to produce oscillating systems is 3.5 times lower when the binding rates are specifically targeted. One could, potentially, test such a scenario experimentally using directed evolution to select for systems with weak interaction rates [40, 41].
Network optimisation improves objective scores for oscillations
Interestingly, by removing constraints on network size we obtained similar results with regards to the likelihood of finding an oscillating system with both single and multifunctioning TFs (Fig. 8 c). However, by comparing how the percentage of oscillating networks depends on network size, one can see that having TFs with multiple functions improves the likelihood of oscillations in larger networks (Additional file 1: Figure S4). Consequently, the likelihood of oscillations does not depend on network size, but on the function of system components, as has been suggested previously [26].
Finally, we looked at the likelihood of oscillating networks being generated by a posttranslational network (i.e. there is no gene regulation, Fig. 8 d). A classic example of a posttranslational oscillator is the cyanobacterial circadian clock that functions due to phosphorylation cycles [42]. These results show that the probability of finding stable or damped oscillations decreases compared to transcriptionaltranslation systems. Thus, one could speculate that transcriptiontranslation networks produce more robust oscillations compared to posttranslational oscillators, as has been observed previously [43]. One potential underlying cause for this is the increased nonlinearities and timedelays maintained within a system that incorporates both transcriptional regulation and posttranslational processes rather than protein interactions alone [26, 28].
Improving the robustness of oscillating networks
To counter this, we wondered whether robustness could be used as an objective for our EA. Due to each parameter of each network needing to be perturbed, the computational time increases dramatically. Thus, we limited our routine to 4 EA runs as a proof of principle (red dots, Fig. 9 a). The networks that were evolved specifically for robustness showed an increase in the fraction of perturbed networks that still maintained oscillatory behaviour. As a comparison of a robust and sensitive oscillating network, we simulated the most robust oscillator (Fig. 9 b) and a sensitive network both deterministically (Additional file 1: Figure S5A) and stochastically using the Gillespie algorithm [39]. The power spectra of the robust network was similar both when calculated from deterministic and stochastic trajectories with a single dominant peak (Fig. 9 c). Notably, within this system, there is not a dominant set of parameter pairs that determine oscillatory behaviour, suggesting that the presence of oscillations may be more likely caused by structural features (Additional file 1: Figure S6). However, for a sensitive system, the dominant peak seen in deterministic simulations is lost when simulated stochastically (Fig. 9 d). This is indicative of internal fluctuations of the network obscuring the underlying low amplitude system dynamics (Additional file 1: Figure S5B).
Multiobjective analysis provides a set of oscillators with different properties
In Fig. 10 c—e, we highlight the network structures that lie along the Pareto front (a Python script to simulate these systems can be found in Additional file 2). Interestingly, each of these designs are extensions of minimal motifs that have been studied previously. First, the optimal structure for high amplitude, short frequency oscillations is a small extension to the Repressilator system (Fig. 10 c). Second, to generate an oscillation of medium amplitude and frequency, a network consisting of coupled positive and negative feedbacks is Pareto optimal (Fig. 10 d). This network has been studied previously in the context of bistable motifs to determine how feedback strength can lead to systems switching between steadystates [44]. Finally, in Fig. 10 e, the Pareto optimal solution for low amplitude and long frequency oscillations is, to the best of our knowledge, a novel network design built upon a negative feedback loop.
Feedforward loops have distinct responses to pulsed inputs
We next asked whether the analysis can be expanded to incorporate all FFLs by evolving parameters and networks connections, whereby different sets of connections (activation/inhibition) are related to different FFL topologies [4]. Additionally, the types of external inputs into the networks (continuous or pulsed) and the regulatory logic gates (AND or ORgate) were varied. Interestingly, the results differ depending on the input signal into the system and the type of logic gate employed within the network (Additional file 1: Figure 7). In the case of continuous inputs, it is hard to observe a relationship between FFL topology and functionality (Additional file 1: Figure S7A, B). However, with pulsed inputs, different FFL topologies are clustered around different regions of the Pareto front, whereby coherent FFL 4 and incoherent FFL 1 possess opposite relationships between system sensitivity and precision (Additional file 1: Figure S7C, D). By analysing this Pareto clustering effect for more motifs, researchers can look to design families of networks with highly specified or tunable functions.
Discussion
EA implementation expands on previous strategies
In this work we have presented a generalised multiobjective network optimisation routine based on the principles of EAs for the purpose of designing synthetic systems (Fig. 2). By combining a schema description (Fig. 1) of model nodes with GMA kinetics (Eq 6), the evolving networks and reaction rates can be simply recombined (Fig. 3) and mutated (Fig. 4). The schema structure that we have employed allows one to track each generation of networks and the mutations that take place. Thus, an evolutionary path of a network from an initial design to a system with specific properties can be found. Furthermore, our algorithm can be easily adapted for further biological test cases through editing of the dictionaries that encode and simulate the networks (see Additional file 1). Importantly, our methodology is one of the first to describe recombination of networks using crossover/sexual reproduction, whereas many previous studies have employed asexual network recombination [21, 22, 30, 32]. This allows for a larger region of the search space to be explored by increasing the variability between individuals and across generations, as has been discussed by evolutionary biologists [45].
With this in mind, we analysed the performance of the algorithm given different settings and objective functions (Fig. 6). We found that the choice of selection method and objective function had the largest influence on the EA?s ability to find optimal networks even when other userdefined choices were randomly selected (Fig. 6 d, e). To assess the functionality of our EA, we initially showed that our EA performs well as an optimisation routine and is able to find parameter sets that generate desired timeseries dynamics in the test cases (Fig. 6 a). Similar observations are made when we allow the optimisation process to incorporate new network nodes and connections and is likely due to the termination criteria used in this study (Fig. 6 b, c). These results show that our EA functions as expected and that, given a target phenotype to match, the EA finds a network and parameter set that is quantitatively similar to the input data. Thus, as well as being used as an algorithm to design networks (see below), our algorithm can be used to reverse engineer underlying networks given specific datasets. Researchers will, therefore, be able to compare different models that yield the same response and elucidate the key design principles within these systems.
Designing oscillatory networks with desired properties
Oscillating biological networks have garnered a lot of attention as researchers attempt to understand the important principles behind their emergence and how similar systems can be constructed [3, 14, 21, 25–29, 32]. The underlying design principles of biological oscillations include having an appropriate number of system components with nonlinear interactions such that there is a sufficient time delay in the network [28]. A classic example of a synthetic tool that satisfies these criteria is the Repressilator [3]. Initial explorations to optimise this system focussed on tuning production and degradation rates whilst, more recently, sources of noise have been reduced [3, 7, 14, 32].
In our work we have expanded the tuning possibilities of the Repressilator to include all system parameters (Fig. 7, Additional file 1: Figure S3 and Table S9). We subsequently found that, as well as production and degradation rates of components, the binding affinities between proteins in the Repressilator system have a key role to play in the generation of oscillations (Fig. 7 d and e). The importance of protein dynamics on oscillations has been previously noted by [21, 25, 26]. Furthermore, we found that the most robust networks are primarily based on the Repressilator as has been studied previously (Fig. 9) [7, 27]. Thus, by confirming the observations of previous studies, we are confident that our algorithm accurately finds networks with desired properties.
Finally, we introduced, for the first time, a multiobjective analysis of oscillating gene networks. By optimising oscillating networks to produce rhythms of varying amplitude, period and shape, we obtained a Pareto front along which one can find optimal networks for different combinations of amplitude and frequency (Fig. 10). The networks that lie along the Pareto front provide a set of motifs with different responses, thus designers of future synthetic networks can take advantage of this by choosing the system that produces the required set of properties. Notably, we found that some of these optimal networks share core structural motifs with oscillating mechanisms that have been previously studied. For example, the core motif for an oscillator we found to have low frequency and high amplitude is a Repressilator network (Fig. 10 c). Whilst networks that have an average frequency and magnitude can be based on coupled autoregulatory positivenegative feedback loops (Fig. 10 d). Consequently, one can move within ‘design space’ between the two networks by adding a new component to the system and tuning the parameters to generate shorter frequency oscillations. To show the generality of these steps, we also obtained a general Pareto front containing all feedforward motifs, highlighting that FFLs form clusters along the Pareto front depending on the conditions being analysed (Additional file 1: Figure S7). This is, thus, a generalised analysis of that presented in [33], who concentrated on one of the eight possible FFL motifs. We envisage that producing Pareto fronts of synthetic networks will allow for an easier overview of potential mechanisms that can be constructed experimentally to achieve desired needs.
Applications to evolutionary research
As well as for the purposes of network design and reverse engineering synthetic systems, we believe that this approach can also be beneficial for those that are interested in the evolution of biological systems. By analysing the evolution of networks, one may be able to understand how complex systems have arisen to relate changing environments with observable phenotypes [8, 9, 22, 46]. Such an idea has been previously studied in relation to circadian clocks and understanding how a complex mix of transcriptional feedback loops and posttranslational networks leads to 24 h rhythms across taxa [29]. We believe that our EA provides a general and easy way of tracking the evolution of networks towards a particular objective using the schema structure we have employed. Consequently, one could build evolutionary trees of network motifs (see Supplementary Information [46]). This opens up a number of interesting questions. For example, given an initial autoregulatory feedback loop, how could multiple circadian clock mechanisms have been reached via evolution? What is the probability of evolution producing a mechanism primarily made up of posttranslational mechanisms (as in mammals) compared to clocks formed by transcriptional feedback loops (as in plants) [47, 48]? How many other network motifs can produce the properties observed by circadian clocks in nature? Similar such questions could be posed for other biological examples.
Further to understanding the development of biological systems across evolution, our EA can also be used to simulate laboratory evolution [40, 41]. By optimising the synthetic Repressilator network, we found that protein binding rates are key to producing stable oscillations (Fig. 7 e). Consequently, by targeting these parameters for evolution, we were able to obtain populations of oscillating systems more quickly than when the whole parameter set is evolved. Based on our results using the test case of the Repressilator, one could use directed evolution to influence protein dynamics and obtain oscillations. Recently, PotvinTrottier et al. have improved the functionality of the Repressilator by reducing sources of noise in the system [7]. Notably, this involved altering the system such that protein dynamics (degradation and binding to the promoter) were less ‘leaky’. By further incorporating control over protein interactions, one could potentially improve the robustness of the Repressilator further.
Conclusion
In this work we have presented a multiobjective EA for the design of synthetic networks. Our EA brings together and extends upon previously studied network design strategies [10, 21, 22, 30, 32]. We have shown the generality of our approach by using the EA to evolve networks towards concentration profiles, oscillating dynamics and signal responses. Furthermore our EA can aid one in understanding how biological systems have evolved from simple motifs to more complex networks in the face of changing environmental conditions. Hence, we believe that our optimisation strategy is ideal to reverse engineer novel networks that satisfy particular constraints.
Declarations
Acknowledgements
The authors would like to thank Ruben van Heck & Emma Keizer for critical reading of the manuscript, and Philip Winkler & Henry Ehlers (Wageningen UR, The Netherlands) for helpful discussions and preliminary studies.
Funding
RWS is funded by FP7 Marie Curie Initial Training Network grant agreement number 316723 and EU Horizon 2020 grant agreement number 634942. CF is supported by HFSP Research grant RGP0025/2013.
Availability of data and materials
The computational algorithm for this study can be found at https://gitlab.com/wurssb/Evolutionary_Algorithm_Network_Design.
Authors’ contributions
RS and CF designed the study. RS, BvS, and CF wrote the paper. BvS wrote the computational algorithm. RS and BvS performed analysis of results. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Milo R, ShenOrr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Network motifs: Simple building blocks of complex networks. Science. 2002; 298:824–7.View ArticlePubMedGoogle Scholar
 YegerLotem E, Sattath S, Kashtan N, Itzkovitz S, Milo R, Pinter RY, et al.Network motifs in integrated cellular networks of transcriptionregulation and proteinprotein interaction. Proc Natl Acad Sci USA. 2004; 101:5934–9.View ArticlePubMedPubMed CentralGoogle Scholar
 Elowitz MB, Leibler S. A synthetic oscillatory network of transcriptional regulators. Nature. 2000; 403:335–8.View ArticlePubMedGoogle Scholar
 Mangan S, Alon U. Structure and function of the feedforward loop network motif. Proc Natl Acad Sci USA. 2003; 100:11980–5.View ArticlePubMedPubMed CentralGoogle Scholar
 Gardner TS, Cantor CR, Collins JJ. Construction of a genetic toggle switch in Escherichia coli. Nature. 2000; 403:339–42.View ArticlePubMedGoogle Scholar
 Tsai TYC, Choi YS, Ma W, Pomerening JR, Tang C, Ferrell JE. Robust, tunable biological oscillations from interlinked positive and negative feedback loops. Science. 2008; 321:126–9.View ArticlePubMedPubMed CentralGoogle Scholar
 PotvinTrottier L, Lord ND, Vinnicombe G, Paulsson J. Synchronous longterm oscillations in a synthetic gene circuit. Nature. 2016; 538:514–7.View ArticlePubMedPubMed CentralGoogle Scholar
 Huang S. The molecular and mathematical basis of Waddington’s epigenetic landscape: A framework for postDarwinian biology?. Bioessays. 2011; 34:149–57.View ArticlePubMedGoogle Scholar
 Jaeger J, Monk N. Bioattractors: Dynamical systems theory and the evolution of regulatory processes. J Physiol. 2014; 592:2267–81.View ArticlePubMedPubMed CentralGoogle Scholar
 OteroMuras I, Banga JR. Multicriteria global optimization for biocircuit design. BMC Syst Biol. 2014; 8:113.View ArticlePubMedPubMed CentralGoogle Scholar
 Dasika MS, Maranas CD. OptCircuit: An optimization based method for computational design of genetic circuits. BMC Syst Biol. 2008; 2:24.View ArticlePubMedPubMed CentralGoogle Scholar
 Huynh L, Kececioglu J, Koppe M, Tagkopoulos I. Automatic design of synthetic gene circuits through mixed integer nonlinear programming. PLOS ONE. 2012; 7:e35529.View ArticlePubMedPubMed CentralGoogle Scholar
 Feng XJ, Hooshangi S, Chen D, Li G, Weiss R, Rabitz H. Optimizing genetic circuits by global sensitivity analysis. Biophysical J. 2004; 87:2195–202.View ArticleGoogle Scholar
 Chen BS, Chen PW. GAbased design algorithms for the robust synthetic genetic oscillators with prescribed amplitude, period and phase. Gene Regul Syst Biol. 2010; 4:35–52.Google Scholar
 Nevozhay D, Adams RM, van Itallie E, Bennett MR, Balazsi G. Mapping the environmental fitness landscape of a synthetic gene circuit. PLOS Comput Biol. 2012; 8:e1002480.View ArticlePubMedPubMed CentralGoogle Scholar
 Rodrigo G, Carrera J, Elena SF. Network design meets in silico evolutionary biology. Biochimie. 2010; 92:746–52.View ArticlePubMedGoogle Scholar
 Francois P. Evolving phenotypic networks in silico. Semin Cell Dev Biol. 2014; 35:90–7.View ArticlePubMedGoogle Scholar
 Sirbu A, Ruskin HJ, Crane M. Comparison of evolutionary algorithms in gene regulatory network model inference. BMC Bioinformatics. 2010; 11:59.View ArticlePubMedPubMed CentralGoogle Scholar
 Simon D. Evolutionary Optimization Algorithms. Hoboken: Wiley; 2013.Google Scholar
 Deb K, Pratap A, Agarwal S, Meyarivan T. A fast and elitist multiobjective genetic algorithm: NSGAII. IEEE Trans Evol Comput. 2002; 6:182–97.View ArticleGoogle Scholar
 Francois P, Hakim V. Design of genetic networks with specified functions by evolution in silico. Proc Natl Acad Sci USA. 2004; 101:580–5.View ArticlePubMedPubMed CentralGoogle Scholar
 Paladugu SR, Chickarmane V, Deckard A, Frumkin JP, McCormack M, Sauro, HM. In silico evolution of functional modules in biochemical networks. IEE Proc Syst Biol. 2006; 153:223–35.View ArticleGoogle Scholar
 Guantes R, Estrada J, Poyatos JF. Tradeoffs and noise tolerance in signal detection by genetic circuits. PLoS ONE. 2010; 5:e12314.View ArticlePubMedPubMed CentralGoogle Scholar
 Jin Y, Meng Y. Emergence of robust regulatory motifs from in silico evolution of sustained oscillation. BioSystems. 2011; 103:38–44.View ArticlePubMedGoogle Scholar
 van Dorp M, Lannoo B, Carlon E. Generation of oscillating gene regulatory network motifs. Physical Review E. 2013; 88:012722.View ArticleGoogle Scholar
 Noman N, Monjo T, Moscato P, Iba H. Evolving robust gene regulatory networks. PLoS ONE. 2015; 10:e0116258.View ArticlePubMedPubMed CentralGoogle Scholar
 Woods ML, Leon M, PerezCarrasco R, Barnes CP. A statistical approach reveals designs for the most robust stochastic gene oscillators. ACS Synth Biol. 2016; 5:459–70.View ArticlePubMedPubMed CentralGoogle Scholar
 Novak B, Tyson JJ. Design principles of biochemical oscillators. Nat Rev Mol Cell Biol. 2008; 9:981–91.View ArticlePubMedPubMed CentralGoogle Scholar
 Rodrigo G, Carrera J, Jaramillo A. Evolutionary mechanisms of circadian clocks. Central Eur J Biol. 2007; 2:233–53.Google Scholar
 Feng S, Ollivier JF, Swain PS, Soyer OS. BioJazz: in silico evolution of cellular networks with unbounded complexity using rulebased modeling. Nucleic Acids Res. 2015; 43:e123.View ArticlePubMedPubMed CentralGoogle Scholar
 Feng S, Ollivier JF, Soyer OS. Enzyme sequestration as a tuning point in controlling response dynamics of singalling networks. PLoS Comput Biol. 2016; 12:e1004918.View ArticlePubMedPubMed CentralGoogle Scholar
 Chang YC, Lin CL, Jennawasin T. Design of synthetic genetic oscillators using evolutionary optimization. Evol Bioinforma. 2013; 9:137–50.Google Scholar
 Boada Y, ReynosoMeza G, Pico J, Vignoni A. Multiobjective optimization framework to obtain modelbased guidelines for tuning biological synthetic devices: an adaptive network case. BMC Syst Biol. 2016; 10:27.View ArticlePubMedPubMed CentralGoogle Scholar
 Goodrich MT, Tamassia R. Graphs and traversals. In: Algorithm design and applications. Hoboken: Wiley: 2014.Google Scholar
 Voit EO. Biochemical systems theory: A review. ISRN Biomathematics. 2013; 2013:897658.View ArticleGoogle Scholar
 Chellaboina V, Bhat SP, Haddad WM, Bernstein DS. Modeling and analysis of massaction kinetics. IEEE Control Syst. 2009; 29:60–78.View ArticleGoogle Scholar
 Zitzler E, Deb K, Thiele L. Comparison of multiobjective evolutionary algorithms: empirical results. Evol Comput. 2000; 8:173–95.View ArticlePubMedGoogle Scholar
 Rudolph G. Convergence of evolutionary algorithms in general search spaces. Proceedings of IEEE International Conference on Evolutionary Computation; 1996, pp. 50–4.Google Scholar
 Gillespie DT. Stochastic simulation of chemical kinetics. Annu Rev Phys Chem. 2007; 58:35–55.View ArticlePubMedGoogle Scholar
 Brophy JAN, Voigt CA. Principles of genetic circuit design. Nat Methods. 2014; 11:508–20.View ArticlePubMedPubMed CentralGoogle Scholar
 Yokobayashi Y, Weiss R, Arnold FH. Directed evolution of a genetic circuit. Proc Natl Acad Sci USA. 2002; 99:16587–91.View ArticlePubMedPubMed CentralGoogle Scholar
 Paijmans J, Lubensky DK, ten Wolde PR. Period robustness and entrainability of the Kai system to changing nucleotide concentrations. Biophys J. 2017; 113:157–73.View ArticlePubMedGoogle Scholar
 Hosokawa N, Kishuge H, Iwasaki H. Attenuation of the posttranslational oscillator via transcriptiontranslation feedback enhances circadianphase shifts in Synechococcus. Proc Natl Acad Sci USA. 2013; 110:14486–91.View ArticlePubMedPubMed CentralGoogle Scholar
 Avendano MS, Leidy C, Pedraza JM. Tuning the range and stability of multiple phenotypic states with coupled positivenegative feedback loops. Nature Commun. 2013; 4:2605.View ArticleGoogle Scholar
 Crow JF. Advantages of sexual reproduction. Dev Genet. 1994; 15:205–13.View ArticlePubMedGoogle Scholar
 McGrane M, Charleston MA. Biological network edit distance. J Comput Biol. 2016; 23. doi:10.1089/cmb.2016.0062.
 Pokhilko A, Fernandez AP, Edwards KD, Southern MM, Halliday KJ, Millar AJ. The clock gene circuit in Arabidopsis includes a repressilator with additional feedback loops. Mol Syst Biol. 2012; 8:574.View ArticlePubMedPubMed CentralGoogle Scholar
 Kim JK, Kilpatrick ZP, Bennett MR, Josic K. Molecular mechanisms that regulate the coupled period of the mammalian circadian clock. Biophys J. 2014; 106:2071–81.View ArticlePubMedPubMed CentralGoogle Scholar