General model of the FFL
The analysis is focused on the most general FFL formed by three genes. We assume that the gene circuit acts as a functional unit responding to an external input by producing output. In figure (1) we show an schematic representation of the FFL, depicting three genes G
X
, G
Y
and G
Z
, with regulatory interactions among each other via their corresponding proteins X, Y and Z.
Gene G
X
(not shown) has a constant production of its protein which is independent of the regulation of the other proteins. We assume that X, however, is synthesized in its inactive form and needs an external effector (the input of the circuit) for its activation. The concentration of active X without input is negligible. Upon addition of effector, the activation of X proceeds very rapidly compared to synthesis and decay and hence, can be approximated by a step function
as in [28]. In the following, X will denote the active protein. Transcription of G
Y
and G
Z
is subjected to internal regulation. Production of Y depends on regulator X, whereas production of Z depends on both X and Y. The FFL can be described by the following ODEs
The parameter γ
i
represents the basal production of protein i, where i = {X, Y, Z}. In this parameter we subsume the concentration of all biochemical elements which remain constant in time. The degradation rate of protein i is denoted as d
i
. The binding equilibrium of the regulators j with the gene G
i
are denoted by
, with j = {X, Y, Z}. The tunable positive parameters αXand βjdescribe the type of regulatory interactions, i.e. activation or inhibition, for gene G
Y
and G
Z
, respectively, without any predefined specific assumptions. They provide the regulatory rates with respect to the basal transcription. Values < 1 correspond to inhibitory regulation, whereas > 1 accounts for activation. The parameter βXYaccounts for the simultaneous regulation of G
Z
.
Traditionally, studies on FFL dynamics have been performed under the assumption of Boolean logic [28, 42] for the control of the output regulation. The presented model includes all theses cases as specific subsets of parameters. For example, assuming a Boolean AND logic for a circuit, where the output Z is positively regulated by Y and X, is described by the following parameter configuration: βx= βY= 1, βXY> 1 and
. The same circuit displaying OR logic requires βx= βY> 1, βXY= 0 and
= 0.
Finally, n and m denote the degree of multimerization of the regulators. The presented results, however, are considering the general case independent of the size of the regulatory factors.
Nullclines' analysis: Changes of the nullclines upon input
The system's fixed points can be determined by inspection of the crossings of its nullclines. The studied input-output system will remain stable unless an external input pushes the FFL towards another stable equilibrium. The FFLs dynamical response upon this change depends on a specific configuration of circuit parameters. Notwithstanding, the possible response-dynamics are restricted by the shape of the nullclines. The general expressions for the nullclines are obtained imposing Ẏ = 0 and Ż [41] and written as follows:
For a general analysis of the nullclines' shape independent of the specific parameters and the size of the multimeric regulators, we focus on a simple set of geometrical features. Within the phase space spanned by {Y, Z}, the nullcline (5) is a simple, vertical straight line. Nullcline (6) shows an horizontal asymptote located at
and crosses the vertical axis at
Furthermore, expression (6) shows a single inflection point, i.e. a point where curvature changes from concave to convex or viceversa, in the biological domain defined by Y ≥ 0 and Z ≥ 0, but no extrema (local maximum or minimum). In order to understand how the input triggers the dynamical response, we study the configuration of the two nullclines with X = 0, i.e. no input, versus X > 0, i.e. with input. In figure (2) we show a numerical example. In absence of input, the system is governed by a single stable fixed point, denoted by ϕX = 0, located at the crossing of the nullclines (5) and (6). On addition of input, the fixed point moves, because the nullcline configuration changes. Now the system shifts from ϕx = 0to the new stable fixed point ϕx > 0. The point ϕX > 0is determined by the crossing of the new location of the nullclines for X > 0. The dynamical response of the system, which is represented by the trajectory of the FFL in phase space, is generated by this change of the stable regime. A subset of four parameters, i.e. {αX, βX, βY, βXY}, describing the interactions of the proteins, classify the type of the circuits into coherent and incoherent according to [28]. Without input, only βYis relevant for the geometry of the nullcline Ż = 0. If βY> 1 the nullcline rises, because Z
HA
> Z0, whereas for βY< 1 the nullcline decreases as shown in figure (3a). Under the presence of an external input the other parameters become relevant (figure (3b)):
-
1.
The regulation of Y by X, described by αX, defines the direction of the shift of the straight nullcline (5) regarding its earlier position for X = 0.
-
2.
The regulation of Z by X, described by βX, displaces the crossing of the nullcline (6) with the vertical axis to a higher (βX> 1) or a lower (βX< 1) value.
-
3.
Finally, the joint regulation βXYof Z by X and Y, strongly influences the location of the horizontal asymptote Z
HA
.
As we will see in the following sections the geometrical features generated by the biological interactions will form the basis for the dynamical behaviour of the FFL.
Response of the FFL
To understand how the changes in the nullclines upon input confine the feasible dynamical response of the circuit, we will focus our attention on two representative cases, namely a coherent and an incoherent type ([28]). We choose circuits C4 and I1, respectively, which differ only in one regulatory interaction in G
Y
, i.e. αX. Due to this interaction only nullcline (5) is affected and with it the direction of its shift as can be seen in figure (3b). The other interactions are the same for both circuit architectures (βx> 1, βY< 1) and hence the possible nullcline's shapes associated to expression (6) are also the same. In figure (4) we show their subset of changes realized by the nullcline. Based on the positive interaction of βXthe crossing point Z0|X > 0with the vertical axis always shifts toward higher values for cases with input compared to cases without input (Z0|X = 0).
Two biological scenarios for the joint regulation of G
Z
by Y and X are plausible: either the complex acts as an activator βXY> 1 as shown in figure (4a) or as inhibitor βXY< 1. In (4b) we show the scenario associated with the conditions βXY< 1 and βXY> βY. The nullcline moves down, but does not cross the original nullcline. In (c) conditions βXY< 1 and βXY<βYlead to a single crossing. The same conditions can lead to a double crossing of the nullclines, however our numerical analysis (data not shown) indicates that the probability to find an adequate configuration of parameters has very low probability (< 0.3%). For simplicity we discard these cases (see Methods for a detailed example of our analysis applied to nullclines with double crossing). By using the relatively simple configuration of the nullclines shown in (b) we already find different possible behaviour of the two circuits C4 and I1 due to their opposing αX.
Whereas for C4 the nullcline's (5) shifts to higher values, allowing only for grader trajectories (inset 1), I1 may show instead two different functionalities. The shift direction is to lower values and hence either grader or pulser can be implemented depending on the set of parameters (see inset 2). The fact that a range of feasible functional scenarios can be intuitively deduced, demands for a method of unambiguous discrimination to resolve the problem.
Separatrix
Sometimes, two qualitatively different dynamics are possible for a given set of nullclines (figure (4b)). In that case, if we consider any arbitrary initial point (Y, Z) in phase space (i.e. any arbitrary concentration values of Y and Z) phase space can be divided into two different areas. As shown in figure (5) this partition of phase space is characterized by a different functional behaviour of the trajectories upon input: starting in one part (yellow area) the trajectory will first join/intersect the nullcline Ż = 0 and then following the nullcline until it reaches the fixed point. Starting from an arbitrary point in the other region (grey area) the trajectory will only reach the nullcline in the fixed point ϕX > 0. This qualitative difference in the trajectories gives rise to a different functional behaviours of the circuit (see figure (6)).
The system without input is determined by its fixed point ϕX = 0. The circuit will exhibit a given dynamic behavior depending on which part of phase space ϕX = 0is located. The location of the boundary between the two dynamical areas, termed separatrix, allows classifying the functionality of the trajectory. The key elements for the discrimination of the different possible functionalities, for a given geometry of the nullclines, is the relative position of three points: the crossing of the vertical nullcline Ẏ = 0 without input, with: i) the nullcline Ż = 0 without input (ϕX = 0), ii) the separatrix (S) and with iii) the nullcline Ż = 0 with input (ξ). We illustrate this with an example (circuit I1) in figure (7a-c) where the relative positions of the crossing points and their implication for the time course is outlined. An analytical estimation of the separatrix and all the possible combinations of these elements with its resulting dynamics are presented in Supplementary Materials.
Our analysis can be generally applied to all three-gene FFLs. We have shown that both examples show enough plasticity in the dynamics to implement different type of response. In the next section we will focus on the probability of emergence of the different feasible types of dynamics.
Probability of emergence of different FFL's dynamics
In the previous section we have shown that in both examples, C4 and I1, more than one possible dynamic behavior can be obtained depending on the specific set of parameters chosen. Generally we can list six types of dynamics, namely positive or negative graders (G+, G-) and positive or negative pulsers (P+, P-). Grader response corresponds to an increase (+) or decrease (-) of the concentration of Z characterized by a transient from the initial to the final state where the concentration of Z never is higher (G+) or lower (G-) than the final concentration. In general, grader responses are related with responses able to filter noise and respond only on absolute certainty [24]. On the other hand, a pulser response is characterized by a rapid increase (P+) or decrease (P-) of the concentration of Z reaching higher (+) or lower (-) values of Z before they reach the final state. Note that for the pulsers independently of the pulse direction (P+ or P-) the final concentration of output protein Z can be higher (T+) or lower (T-) than the initial concentration. Hence we separately analyze four different pulser dynamics, namely P+(T+), P+(T-), P-(T+), P-(T-). The time courses of the different functionalities are outlined in figure (6). A specific subset of dynamics can be determined for each FFL frequently containing functions, which cannot be implemented. Based on this results the main question is: Are all the feasible dynamics equally probable? To address this question we have performed an analytical study of the parametric requirements necessary to implement a given type of dynamical response.
Backbone of requirements for the FFL response
We can deduce geometrically all possible sets of trajectories. We discriminate the necessary parametric conditions for the emergence of one specific dynamic among all the possible, due to the relative position of the initial point ϕX = 0of the trajectory in relation to the separatrix S, ξ and the specific shape of the nullclines. The parametric conditions, which determine the shape of the nullcline, can be systematically formalized. The key geometrical elements of the nullclines can be described by a set of exclusive parametric combinations defined in a string, which we call the backbone of requirements. Each position in this sequence contains the solution for two possible parametric states.
The example shown in figure (7) is meant to illustrate the procedure. The cases in (a-c) display the same geometrical shape and can be described by a single backbone of requirements shown in (d). Note that not all the elements of the parametric sequence need to be defined explicitly. In certain occasions some of them are uniquely defined by previous conditions, denoted by *, or otherwise do not have an impact on the geometrical scenario, represented by '∅'. For example position one is denoted '∅'. If condition two (βXY> βY) is satisfied, position one is not relevant, because both solutions (βXY> 1 or βXY< 1), provide the same geometry. On the other hand position four, denoted as '*', is always solved as '>', because it can be deduced from the second condition given that βX> 1.
Quantification of the dynamical probabilities
Within the framework of our model, we assume that the feasibility of given dynamical behaviour (transient response) is heavily dependent on the number of exclusive, parametric configurations necessary for the realization of this behaviour. These are subsumed in the backbone of requirements. We have generated all possible backbone sequences for the circuits C4 and I1. For each of these sequences the separatrix discriminates among different dynamics as we have shown in figure (7a-c). In circuit C4 we have found 18 different sequences implementing five different types of dynamics, where as for circuit I1, 21 sequences implement four different behaviors. The detailed list of backbone sequences are shown in Supplementary Materials. The total number N
C
of possible requirements captured within one sequence i within a circuit k = {C 4, I 1} is constant and follows the sum:
Here,
represents the number of requirements having no impact on the nullclines' geometry. For this kind of elements of the sequence, both possible solutions ('>' or '<') are valid and hence discrimination is unnecessary. Therefore,
different combinations of parameters are described by the same backbone sequence providing the same geometry for the nullclines. The second term
, is the number of elements predefined by other conditions of the backbone. Finally,
is the number of requirement actually necessary to determine univocaly the shape of the nullcline. In other words, in order to implement a determined backbone sequence it is sufficient to properly establish the conditions
.
Once these conditions are set, the rest of the sequence is determined. This allows to calculate the probability for a given set of parameters to implement a certain backbone sequence i as
For a given FFL topology it is possible to find different sets of parameters compatible with such topology but implementing different dynamics among the six possibilities. Each set of parameters fits in a specific backbone sequence. For each backbone sequence there exist three different relative positions of the separatrix discriminating among the different possible dynamics. This idea is illustrated in figure (7). The same backbone sequence shown in 7d can implement two different dynamics, namely G+and P+ T +. The location of the separatix shown in figures (7a-c) depends on the numerical values of the parameters. If these specific values locate the separatrix as depicted in figure (7a) the FFL motif will implement a grader dynamics, whereas other locations (see figures (7b, c)) will provide a pulser dynamics. The probability of circuit k to implement a given dynamic j = {G+, G-, P+(T+), P+(T-), P-(T+), P-(T-)} described by the backbone sequence i can, thus, be calculated as:
where Ψ
ijk
is the number of equal dynamical outcomes j for a given backbone i in the k-thFFL and T
jk
the total number of backbone sequences implementing dynamic j. The normalization constant Ω
k
is defined by imposing the condition
and gives:
By calculating the probability of implementation of the different types of dynamics for the different FFLs we obtain an interesting picture regarding our initial question, whether the topology of the FFL implements its function. As shown in figure (8a) the coherent FFL C4 implements five different types of dynamics. Their probability (given an arbitrary set of parameters) is however very different. We find that C4 is significantly specialized for G+ dynamics. The degree of specialization is equally reflected in its robustness versus parametric change. On the other hand, the incoherent FFL I1 potentially implements four different types of dynamics, as shown in figure (8b). The probabilities for I1 to implement these functionalities are within the same range, unlike in C4. In figure (8c) the distribution of probabilities of C1 (the most abundant FFL) obtained performing the same analysis is shown. The specific topology allows for flexibility at the cost of less specialization. We see that the topology of the FFL does not implement its function, but instead the probability of a certain function to arise. If, for a given FFL, the distribution accentuates a certain function, the FFL is said to be specialized for this function. If the probabilities are located within the same range, the topology implements flexibility. Both aspects are equally relevant in terms of adaptation and evolvability.
The highest level of specialization of a given motif would correspond to a distribution of probabilities displaying a single peak, whereas the maximum level of flexibility would correspond to a flat distribution of probabilities, where all the potential dynamics would be equally probable. In order to compare the level of specialization of different FFL topologies we propose to measure the peakedness of the probability distribution, i.e. the kurtosis [43] (see Methods for details about kurtosis). The values of kurtosis are KC 1= 4.6, KC 4= 7.8 and KI 1= 1.28, respectively indicating that C1 has an intermediate level of specialization with respect to C4 (more specialized) and I1 (more flexible). The same qualitative distributions are obtained by numerical simulation choosing random sets of parameters and counting the frequency of each dynamics (data not shown).