Previous works [6, 21] have hypothesized that GRNs have evolved towards maximizing the temporal pairwise mutual information between the genes' expression levels, as a means to increase their degree of coordination by increasing the amount of information propagated between them. From global gene expression measurements following gene deletion and overexpression, we inferred the topology and logic of a core gene network of S. cerevisiae, and then simulated its dynamics using the Boolean network modeling strategy. The study of the input-output distribution showed that more genes have a very high number of inputs than expected by chance given the mean K, and that these genes have transfer functions with p-bias close to 1 or 0. We hypothesize that each of these genes is preferentially regulated by a few of its TFs (under rich medium conditions), the others only being relevant in the absence of those few or in adverse conditions. This agrees with the fact that only a small fraction of single TF deletion mutants in S. cerevisiae are lethal.
Another possible, compatible explanation is that the "minor TFs" have overlapping functions. Possible approaches to investigate this include performing similar deletion experiments under conditions closer to those found in the wild, or examining multiple deletion mutants for lethal phenotypes, for genes whose single deletion is non-lethal.
Contrary to what would be expected if the network were randomly wired, the inferred core network has a very high generalized clustering coefficient, C. This is known to enhance the ability of networks to propagate information. However, another interpretation is possible for the high C. Namely, the GRN may have evolved a high C because it needs many clusters of a small number of genes to perform specific functions that require a high degree of coordination.
Finally, we found that although the average p-bias of the transfer functions is close to 0.5, the p-bias distribution resembles a beta-like distribution with high variance, far from what is expected by chance. Because of this, and despite its very high connectivity, the core network is near critical, which is known to enhance information propagation.
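The link between the variance of the p-bias distribution and criticality can be illustrated with a short calculation. The sketch below is not the procedure used in this study. It assumes the standard annealed approximation for random Boolean networks, in which a gene with K inputs and bias p has expected sensitivity 2Kp(1-p), and it further assumes that K and p are independent across genes, so that the network sensitivity is s = 2<K>(E[p](1 - E[p]) - Var[p]). The function names, the symmetric Beta(a, a) family, and the mean connectivity of 6 are hypothetical choices made only for illustration.

```python
def beta_moments(a, b):
    """Return the mean and variance of a Beta(a, b) distribution."""
    total = a + b
    mean = a / total
    var = a * b / (total ** 2 * (total + 1.0))
    return mean, var


def expected_sensitivity(mean_k, a, b):
    """Annealed estimate of the network sensitivity.

    Assumes each gene has expected sensitivity 2 * K * p * (1 - p) and that
    K and p are independent, so s = 2 * <K> * (E[p] * (1 - E[p]) - Var[p]).
    """
    mean, var = beta_moments(a, b)
    return 2.0 * mean_k * (mean * (1.0 - mean) - var)


if __name__ == "__main__":
    mean_k = 6.0  # hypothetical mean connectivity, not a value from this study
    # All three distributions have mean p-bias 0.5; only the variance differs.
    for a in (100.0, 1.0, 0.25):
        mean, var = beta_moments(a, a)
        s = expected_sensitivity(mean_k, a, a)
        print(f"Beta({a}, {a}): mean={mean:.2f} var={var:.4f} sensitivity={s:.3f}")
```

Under these assumptions, a narrow distribution around 0.5 gives a sensitivity close to 2<K> x 0.25, well inside the chaotic regime, whereas a strongly U-shaped distribution with the same mean can bring the sensitivity down to 1 even for a high mean connectivity.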
We do not know the cause of the high variance of the p-bias distribution. It may be merely a consequence of the inability of genes to realize complex transfer functions. In that scenario, it would be more of a hindrance to the network's capacity to transfer information than an advantage.
However, given the high mean connectivity and the mean p-bias near 0.5, the network would be "chaotic" if the p-bias distribution were not beta-like with high variance; it is this shape that allows the sensitivity to be approximately 1. Because of this, we hypothesize that the shape of the p-bias distribution may have evolved to allow the core GRN of S. cerevisiae to be near the critical regime, consistent with the hypothesis that critical GRNs are naturally favored. The critical regime is the dynamical regime for which I is maximized.
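To make the quantity I more concrete, the following sketch estimates a temporal pairwise mutual information from a Boolean time series. It assumes that I is the average, over all ordered gene pairs (i, j), of the mutual information between the state of gene i at time t and the state of gene j at time t+1, and it uses a simple plug-in estimator. Whether pairs with i = j are included, and how finite-sample bias is handled, are choices made here for illustration and not necessarily those of the cited works. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two equal-length sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * log2(count * n / (px[x] * py[y]))
    return mi


def mean_temporal_pairwise_mi(series):
    """Average MI between x_i(t) and x_j(t+1) over all ordered gene pairs.

    `series` is a list of network states, each a tuple of 0/1 values,
    ordered in time. Pairs with i == j are included (an assumption).
    """
    n_genes = len(series[0])
    total = 0.0
    for i in range(n_genes):
        now = [state[i] for state in series[:-1]]
        for j in range(n_genes):
            nxt = [state[j] for state in series[1:]]
            total += mutual_information(now, nxt)
    return total / (n_genes ** 2)


if __name__ == "__main__":
    # Toy trajectory: gene 0 oscillates, gene 1 copies gene 0 with a one-step delay.
    toy = [(0, 0), (1, 0), (0, 1), (1, 0), (0, 1)]
    print(mean_temporal_pairwise_mi(toy))
```

For short time series the plug-in estimator is biased upwards, so in practice long trajectories or a bias correction would be needed before comparing networks.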
Relevantly, in , it was found that critical RBNs, in comparison with ordered and chaotic ones, are those that best predict the measured distribution of genes whose activities are altered in several hundred knockout mutants of S. cerevisiae, supporting our finding that the core network appears to be near critical. Studies on other GRNs using different methods to assess criticality [8, 30, 31] have found them to be near critical as well.
We further found that the core network has a high C. Since both features enhance information propagation within the core GRN, it may be that the maximization of propagation of information within GRNs is a general principle by which natural selection shapes the large-scale topology and logic of GRNs. It is of relevance to state that while we compared the dynamics of the inferred core network with null-model networks with a random topology, we do not imply that the GRN of ancestors of S. cerevisiae had a more "random topology" than the present GRN of S. cerevisiae. From our results we can only conclude that the present core GRN of S. cerevisiae is able to propagate information throughout its nodes far more efficiently than standard random topologies, due to its "far from random" values of C, K, and p-bias. We hypothesize that these features have been subject to selection and that, as a consequence, the present core GRN of S. cerevisiae is likely to be more efficient in propagating information throughout its nodes than those of its ancestors. Nevertheless, we cannot rule out the possibility that the present values of these "global topological" parameters result from a variety of different and independent evolutionary steps, acting at a small topological scale, which indirectly also lead to an overall more efficient information propagation throughout the GRN.
Finally, we note that our findings are likely to rely, to some extent, on the choice of GRN modeling strategy (the "Boolean" approach). It will be of great interest to investigate the findings reported here using more realistic modeling strategies, such as the delayed stochastic modeling strategy [4, 32], shown to match measurements of gene expression at the single RNA and protein level. For this to be possible, methods for quantifying information, noise, and sensitivity from stochastic temporal expression levels of RNA and protein, as well as from the state of the promoter (free for transcribing, bound by TFs, etc.), need further development.