Rule-based multi-level modeling of cell biological systems
© Maus et al; licensee BioMed Central Ltd. 2011
Received: 3 June 2011
Accepted: 17 October 2011
Published: 17 October 2011
Skip to main content
© Maus et al; licensee BioMed Central Ltd. 2011
Received: 3 June 2011
Accepted: 17 October 2011
Published: 17 October 2011
Proteins, individual cells, and cell populations denote different levels of an organizational hierarchy, each of which with its own dynamics. Multi-level modeling is concerned with describing a system at these different levels and relating their dynamics. Rule-based modeling has increasingly attracted attention due to enabling a concise and compact description of biochemical systems. In addition, it allows different methods for model analysis, since more than one semantics can be defined for the same syntax.
Multi-level modeling implies the hierarchical nesting of model entities and explicit support for downward and upward causation between different levels. Concepts to support multi-level modeling in a rule-based language are identified. To those belong rule schemata, hierarchical nesting of species, assigning attributes and solutions to species at each level and preserving content of nested species while applying rules. Further necessities are the ability to apply rules and flexibly define reaction rate kinetics and constraints on nested species as well as species that are nested within others. An example model is presented that analyses the interplay of an intracellular control circuit with states at cell level, its relation to cell division, and connections to intercellular communication within a population of cells. The example is described in ML-Rules - a rule-based multi-level approach that has been realized within the plug-in-based modeling and simulation framework JAMES II.
Rule-based languages are a suitable starting point for developing a concise and compact language for multi-level modeling of cell biological systems. The combination of nesting species, assigning attributes, and constraining reactions according to these attributes is crucial in achieving the desired expressiveness. Rule schemata allow a concise and compact description of complex models. As a result, the presented approach facilitates developing and maintaining multi-level models that, for instance, interrelate intracellular and intercellular dynamics.
In computational modeling of cell biological processes, a formal representation, i.e. a model, of the dynamics of the system under study is the central subject of investigations. Cell biological models typically focus on the processes of molecules like proteins and small chemicals. However, in addition, dynamics at cell level, e.g. proliferation and differentiation of stem cells, and cell-cell interaction, influence these intracellular dynamics as well, just like such high-level dynamics are influenced by processes at the molecular level. This hierarchical organization and the causalities between different levels, i.e. from the lower to the upper (upward causation) and vice versa (downward causation), are universal characteristics of biological systems [1, 2]. Hence, multi-levelness has been identified to be an important and general principle of systems biology . Depending on the question that shall be pursued with the model, capturing processes that happen at different levels, e.g. proteins, individual cells, and cell populations, and their interrelations within the model is of relevance . The question is how can this multi-levelness be supported by modeling methodologies? We will pursue this question in the context of rule-based modeling.
In the past years, many different modeling languages have been introduced to support modelers in their task, for example [5–8]. The idea is to write down a model not directly mathematically, like in ordinary differential equations (ODEs) or stochastic processes, but in terms of a tailor-made syntax. A semantics is then provided that bridges the gap between what is written and the mathematical definition of its computation. A carefully designed syntax can increase the accessibility of models for discussion and presentation, especially for domain experts that are not extensively familiar with modeling and the underlying mathematical formalism. Formal modeling languages can also extend the flexibility in the choice of methods for model analysis, since more than one semantics can be defined for the same syntax (see [9–12] for some examples).
Chemical solutions, i.e. mappings from species to concentrations or alternatively to their discrete integer amounts, describe a model's state. A formal semantics can then be defined as mapping of chemical solutions and reactions to, for example, stochastic processes or ODEs [11, 15].
Many rule-based approaches, for instance [15–17], allow to describe species with attributes as well as rules with reactant patterns, i.e. by specifying structured molecules and rule schemata they allow to model basic reactions as an alternative to the entire network of all possible chemical species and reactions. Thus, due to rule schemata, complexity - in terms of the number of required rules - can be significantly reduced .
Let us illustrate this with the help of a simple example. Proteins, in particular those involved in signaling pathways, often show various sites for binding other molecules. Furthermore, modifications like phosphorylation or methylation at diverse sites might determine the ability for binding other molecules and thus influence the interaction pattern of a protein. Hence, combinatorial explosion easily leads to very complex network models with hundreds or even thousands of species and reactions. Consider a system of ten interacting proteins. One of them is a large scaffolding protein that might reversibly bind each of the other nine proteins at nine distinct sites. We assume each of the nine binding reactions to be independent from other bindings. This at first view quite simple model requires to define 521 (2n + n with n = 9) molecular species, i.e. distinct combinations of bindings, and 4608 reactions. Attributes and rule schemata help to deal with the system's complexity by specifying only the basic molecules and reactions. As the binding reactions are assumed to be independent from each other, each of them may be described individually without taking the state of other binding sites into account. By omitting such irrelevant information, one rule might then be translated into multiple basic reactions of a large network by which the model is kept small and manageable. Hence, a rule-based modeling language like BioNetGen  allows to model the above system by specifying a set of only ten molecules and nine reversible rule schemata instead of 521 molecular species and 4608 reactions. For a more comprehensive review of rule-based modeling and its advantages for formal descriptions of signal transduction pathways, we would like to refer to .
Summing up, due to schematic rules, the complexity of a model may be effectively reduced and an intuitive modeling metaphor along the lines of well-known chemical reaction equations facilitates the process of modeling and the accessibility of models. Consequently, the number of rule-based approaches to describe biochemical reactions has increased during the last years, e.g. [8, 11, 15, 16], and also an increasing number of publications can be observed that utilize rule-based languages for concrete modeling studies, e.g. [19–22].
In this paper, we identify concepts for supporting rule-based multi-level modeling. We show a realization of these concepts as part of ML-Rules, a modeling and simulation approach we developed. Thereby, we start with concepts that nearly all current state of the art rule-based approaches support to successively approach concepts that are obviously related to multi-levelness, i.e. nesting. Thereafter, an example in which intracellular and intercellular dynamics are combined will illuminate the role that each of these concepts play in supporting multi-level modeling. Finally, to complete the results and discussion part of the paper, related work will be revisited to discuss which of the identified concepts are already supported.
A very brief introduction to rule-based modeling is already given in the previous background section. In the following, we will focus on the concepts we use in our rule-based multi-level approach (ML-Rules). Their respective role in supporting multi-level modeling will be shown in the subsequent example model.
For first studies, we base ML-Rules on continuous time Markov chains (CTMCs). The semantics is discrete population-based, i.e. we work with natural copy numbers of identical species instead of real valued concentrations. The reason for a stochastic semantics lies in the observation that at higher levels of organization (like cells) no longer abundant numbers will be able to balance fluctuations as can be often observed at lower levels, e.g. proteins that are involved in metabolic pathways. And also when looking at levels further down in the hierarchy, e.g. gene regulatory processes, stochastic events may play a crucial role due to low copy numbers of involved species. Hence, stochasticity is often an essential feature for multi-level modeling of cell biological systems . However, it should be noted that stochasticity is not necessarily constrained to CTMCs, as sometimes at higher levels other than exponential time delays are required, e.g. normal distributions .
The basic building blocks of ML-Rules models are called species which may represent any object of interest, e.g. small chemicals, macro-molecules like proteins, or membrane bound cellular compartments. Each species has a name, e.g. A, and each name has a fixed arity ar ∈ ℕ0 that specifies the number of attributes of a species. Attributes are not restricted to a finite set of values and they may be of any kind of numerical value and textual string. For convention throughout this paper, species names start with a capital letter and attributes are written in a bold font type within parentheses behind the name. Parentheses are omitted if ar = 0. For example, A, A(1), A(0,1.67), and A(green,-15, true) are valid examples for a species A with ar(A) = 0, 1, 2, and 3 respectively. However, these examples are invalid when two or more of them are being used within the same model, as the arity of a species name is fixed and therefore may not vary between species with identical names, i.e. A in this case. Each defined combination of attributes is a distinct species, i.e. A(1, 1) and A(1, 2) share the same name but are different species.
A solution is a multiset of species, i.e. can be either a single species or a composition of multiple sub-solutions. The '+' is the delimiter symbol for composing multiple solutions and a solution can be also an empty set ∅. We write nA with n ∈ ℕ0 to refer to a solution which is composed of n identical copies of A (n is omitted if n = 1). For example, [2 A(1) + 4A(2) + B] describes a solution consisting of three different species with an amount of 2, 4, and 1 respectively.
For simplicity reasons, our syntax only allows for uni-directional rules. Thus, two complementary rules have to be defined for modeling reversible reactions. However, it would be straightforward to extend the syntax to reversible reaction rules if needed.
Rule schemata are a notational convenience, which uses variables to bind attributes of reactants. By doing so, each rule schema may encode for several rule instantiations, i.e. reactions. Let us take a reaction which converts a species A into another species B. Consider the situation, that A is an attributed species with an arity ar(A) = 1, but the attribute is of no interest for its reaction to B. Without schematic rules, we would need to specify one rule for each possible value of the attribute of A, which can be tedious and error-prone.
Besides such very simple variants, rule schemata can be also defined by employing expressions to specify attributes. For instance, a reactant pattern A(x) + A(2x) matches every solution where at least two As exist, one of which attribute's value is exactly twice the attribute's value of the other one. Expressions can be also used to specify the attribute of the products, e.g. the rule A(x) → B(2x) applied to a solution [A(2) + B(3)] would lead to [B(3) + B(4)]. Please note, arbitrary functions can be used for expressions (see also the section on implementation).
F is a constant that denotes a free binding site and νx creates a fingerprint-like unique value which does not already occur in the current model state. It is assigned to the products on the right hand side of the rule via variable x, i.e. in an instantiation of the rule, x is replaced by a newly created unique value which serves as identifier for this particular binding. This method for representing linkage of species is identical to private channels in the π-calculus [6, 26] and allows to model molecular complexes similar as can be done with rule-based languages that have explicit notions of complexation, e.g. [16, 17]. Moreover, once created, unique values can be used in a highly flexible manner, e.g. to describe bonds shared by more than two binding partners (like hyperedges in a graph) or across level boundaries (the concept of multiple levels will be introduced below). Species can also be marked with an unique indentifier to observe the dynamics of individual entities.
However, note that the approach also has some drawbacks compared to explicit notions of molecular bindings: first of all, without profound knowledge or decent annotation of the model, it might be difficult to find out whether certain attributes of species represent binding sites. The approach would also allow to reset just one species to its unbound state while its former binding partner remains unchanged. Therefore, the modeler is responsible for describing a correct model without such unrealistic dynamics. Furthermore, at least in the current implementation of ML-Rules, using identifiers for modeling links between species may slow down the simulation as the number of distinct species increases and therefore matching reactants may take significantly more time (see section on implementation).
The higher the rate of a rule, the more likely the rule will fire at a time calculated according to this rate. The kinetic rate can be a simple constant numerical value, for example, to describe a chemical reaction with constant speed, i.e. a zeroth-order reaction whose speed is independent of the amount of any chemical species. However, most reaction rules that describe biological systems need to take the amount of one or more reactant species into account for specifying correct system dynamics. Probably in most cases the kinetics of a rule follows the law of mass action, but in systems biology alternative kinetics, e.g. Michaelis-Menten kinetics for enzymatic reactions and Hill functions for describing cooperativity, are also frequently applied . That is why we allow for arbitrary reaction rates using mathematical expressions. Any kind of mathematical expression is allowed that evaluates to a non-negative numerical value, see also .
If the amount of A(x) in a given solution does not exceed T, the if-then-else expression evaluates to a kinetic rate of 0, which determines that the rule will not fire.
The features listed so far are not new. Species with attributes and schematic rules are standard features that can be found in nearly every rule-based language in the field of systems biology, e.g. [11, 15, 16]. The requirement for modeling biological systems with rate kinetics different from those following the law of mass action seems to be also widely accepted nowadays. BIOCHAM , LBS , and React(C) , for example, support arbitrary reaction rates. Also recent developments for the BioNetGen language now allow to specify user-defined rate law functions, and moreover, modeling of conditional expressions to support the construction of logical sequences of control . All together, they will play a role in supporting multi-level modeling.
However, a truly salient feature of multi-level modeling are hierarchies. Hierarchical structuring facilitates modeling of complex biological systems by defining them in terms of their components and the interactions that exist between them. Hierarchies help to structure the knowledge about a given system . In addition, they allow to describe multiple nested solutions similar to the multiple separated reaction compartments that can be found in biological systems, e.g. cells, organelles, and vesicles.
The ability to assign attributes to nested species allows to equip each hierarchical level with an own state that is not only determined by the enclosed species. Such high-level states are of particular interest for multi-level modeling as they allow to describe dynamic behavior similar to observations performed at different levels of organization. The later example model will illustrate this in detail.
describes a reaction from A(0) to A(1) under the condition that A encloses at least one species C. Please note, this rule may also match species where A(0) contains further species in addition to the C (no matter which and how many), for instance like in the previous example. However, such a sub-solution gets lost when the rule fires, as the product species contains just exactly one C. The same holds true for a potential sub-solution of C; the reactant pattern matches every species where a C is part of a sub-solution of A(0), but it says nothing about a sub-solution of C. Hence, if the reactant species C would contain further species, they would get lost as well.
Such bound solutions can be freely reused for defining the products, i.e. migration, copying, and merging of solutions are easy tasks. The problem of splitting is another matter, for which specific operations on solutions are needed, e.g. to split a solution equally into two new solutions. Whereas ML-Rules and its current simulator allow the use of arbitrary functions on attributes, the application of functions on solutions is not yet supported. However, an integration into ML-Rules requires only slight adaptations of the syntax and semantics and is therefore planned for the near future.
Later, we will provide a more detailed explanation of our multi-level approach based on a realistic biological system. With the help of this example, we will also motivate again the need for multi-level modeling and illustrate how to realize upward and downward causation.
Below we provide a basic description of the simulation algorithm. Please note that in JAMES II simulation algorithms are not designed as monolithic blocks. By using plug-ins, alternative sub-algorithms can be easily exploited and combined. It has been shown that the performance and suitability of algorithms depend to a large degree on the concrete model, that details (i.e. sub-algorithms) matter, and that a suitable configuration can significantly speed up simulation . In combination with methods that help to automatically select and configure simulators on demand, this type of simulation design supports a high flexibility for executing multi-level models. Therefore, the simulator is structured as follows.
Require: S, rules
for ru ∈ rules do
rts ← MatchReactants(ru, S)
rcts ← rcts | CreateReactions(ru, rts, S)
for r ∈ rcts do
r prop ← CalcPropensity(r)
reaction ← SSA(rcts)
for reactant ∈ reaction do
for product ∈ reaction do
MatchReactants selects all matching reactants of the selected rule schema ru. The next step is to instantiate the rule schema ru and grouping equivalent rule instances by calculating the reactions rtcs. Now the propensity is calculated for each reaction r. Please note, as now the number of reactants (that apply) is part of the reaction, calculating the propensity is only dependent on r. Also, all information needed is directly available for each product in r, such as bound attribute values or solutions that are used on the rule's product side. After that an SSA is invoked. We have so far integrated the Direct Reaction Method of Gillespie. The selected reaction is executed by removing reactants and adding products.
The complexity of MatchReactants for matching a reactant is where n denotes the number of species in the solution, m the number of species in one context and k the depth of nesting of the reactant. The complexity of CreateReactions is where l is the number of reactants of a rule. Some optimizations are employed, e.g. to restrict the search space for matching reactants. For example, when evaluating a rule A(4) → B, all As whose attribute does not equal 4 will not be considered for matching the reactants of a rule to a solution. However, most of the simulation efforts still goes into calculating rts, i.e. matching the rules (coarsely 50% of the overall calculation). Therefore, current efforts are dedicated towards developing alternative approaches for matching, e.g. integrating special index methods. To avoid time-consuming instantiation of all possible reactions, an alternative kinetic Monte Carlo simulation approach [30, 36, 37] based on individual particles rather than populations of identical species, might also be worth to explore for simulating ML-Rules models.
With CalcPropensity the propensity of each generated reaction is calculated using the specified expression and taking the context of the matched solution into account. This means that the propensity is adjusted according to the amount of possible contexts the matched solution is part of (see previous Section and Figure 2). Based on Java reflection, currently functions provided by Java can be used within the expressions. The integration of a library of own functions as a plug-in will be realized in the future.
The current simulator proceeds basically as a discrete event simulator. Therefore, only slight adaptations are required to support also events that are not distributed exponentially. This feature is important as many biological phenomena at higher levels are not necessarily exponentially distributed, as argued in . Also, as multi-level models operate often at different temporal scales, the calculation effort for simulating these models in a pure discrete event manner might easily become prohibitive. Therefore, hybrid simulation approaches, e.g. [23, 39], shall be exploited in the future.
With additional file 1 we provide a prototypical demo software tool comprising a model editor, simulator, a rudimentary line chart visualization, and simulation data export. The following examples as well as additional example models can be loaded and demonstrate the concrete syntax of ML-Rules. Its source code will be made available under an open source licence as part of a following JAMES II release at http://www.jamesii.org.
The presented model illuminates the importance to consider different levels of organization for modeling certain phenomena in cell biology and shows the previously introduced concepts at work. Particularly the extension of the model from a single cell to multiple cells illuminates the benefits of the presented rule-based multi-level approach.
As the model is not intended to be a contribution for fission yeast science, it may not be sound in each aspect and may not reflect the current level of knowledge about this system. Also certain parameters are simply estimated. However, we payed attention to presenting a realistic case study that shows how a rule-based multi-level approach like ML-Rules facilitates modeling of such systems.
The eurkaryotic cell cycle consists of four distinct phases: G1, S, G2, and M. During the first three phases, a cell is increasing in size and its DNA is replicated. At the end of the cycle, a cell enters the M phase (mitosis) and finally divides into two daughter (or sibling) cells. These major events of the cell division cycle are controled by certain proteins and the underlying regulation processes in fission yeast have been extensively studied [40–45].
In our example model, the regulation at protein level is based on an early model by Tyson . This deterministic continuous model consists of two proteins, cyclin and cdc2, that form a complex called maturation promoting factor (MPF) which in turn controls traversion through the cell cycle. Today, there exist much more detailed models of this system than the relatively simple Tyson model, e.g. [43, 47–49]. However, the purpose here is not to provide the most accurate model of yeast cell cycle control but to show why multi-level modeling is important for studying certain aspects of cell division and how it can be realized. In this sense, the Tyson model is well suited as it is simple but at the same time captures the essential dynamics.
Most reactions of this model follow the law of mass action and we do not discuss each rule in detail here. Instead, we would like to refer to the supplementary material where the whole model can be found (additional file 2). The interesting reactions from our point of view are the activation of MPF and the subsequent dissociation of this complex. Activation of inactive MPF, i.e. the dephosphorylation of its cdc2 subunit, is assumed to be an autocatalytic process:
The higher the amount of activated MPF (M A ), the higher is the activation rate; D tot is a model parameter that denotes the total amount of cdc2.
Please notice that it would be easy to increase the amount of cells here by putting two cells on the right-hand side of the rule. Later we extend the model in this way. However, as the interplay between a high-level state and processes at lower level is subject of our interest here, we keep the cell number constant and just mimic cell division by halving the cell volume. At this stage, the model comprises the intracellular processes shown in  and explicit downward causation where the state of a cell (its volume) influcences a process at lower level. However, as Figure 5B depicts, although the mean oscillation period is longer than before and thus closer to observed cell cycle durations, now it is highly variable and still significantly shorter than the mass-doubling time (T d ) of 116 minutes. Moreover, this leads to nonconformity of active MPA bursts and cell division times. Hence, to get lifelike oscillatory behavior and to achieve better accordance between protein peaks and division time, we need to further extend the model. Let us therefore take a look on how low-level states influence dynamics at the cell level, i.e. how intracellular dynamics trigger high-level events so that the cell traverses through the different cell cycle phases.
In all three cases of phase transitions, the content of the cell remains the same as the only change governed by the rules is dedicated to the cell cycle phases. The rules describe typical examples of upward causation, i.e. low-level states (the amount of certain protein complexes) determine dynamics at higher levels (the cell). They also show the necessity for flexible reaction constraints to model inter-level causalities.
Simulation results of this multi-level model show that the mean period of oscillations is now in accordance with the mass-doubling time T d (Figure 5C). Cell division may happen at volumes larger than 2 and consequently the cell cycle may take more time than T d , but if so this has implications for the next cycle. Due to the unusually large volume at birth, MPA activation occurs relatively fast and thus the following cycle tends to be a bit shorter than normal. In this way, upward and downward causation regulates both, the cell cycle duration and cell size homeostasis.
The results emphasize the role of multiple levels and their inter-relation in studying phenomena like cell division. Therefore, ML-Rules provides nesting of species and rates with arbitrary kinetics based on constraints. So far the model appears still to be manageable in other less expressive modeling approaches. However the next step, i.e. moving from a single cell model to multicellular dynamics, illuminates the importance to be able to express multiple levels and their inter-level causalities explicitly and flexibly.
The unicellular fission yeast may undergo sexual reproduction when environmental conditions are getting poor, e.g. when cells are starving. Different mating types (P and M) exist enforcing fusion of cells of opposite types only . The product of fusion is a diploid zygote which rapidly enters a sporulation process. Later, when the environmental conditions improve, spores germinate to spawn haploid cells which then undergo normal asexual proliferation again. The mating type of proliferating cells switches sporadically when a cell divides. This phenomenon is regulated by rather complex mechanisms at gene level [52–57]. However, rather stable phenomenological patterns of switching can be observed . One important characteristic is that cells do not only show one of the two different mating types P or M, but can be also categorized into cells that are able to switch their type and those that are not (Figure 6A).
The above rule describes cell division of an unswitchable cell (denoted by the U attribute). The complementary schema for division of switchable cells looks pretty much the same and is therefore not shown here. The only differences affect the last attribute of the reactant (S instead of U) and the assignment for the mating type of the unswitchable product cell: the conditional expression if t = P then M else P is assigned which makes the rule valid for matching both mating types P and M.
Now that we have introduced multiple instances of cells (each with potentially own behavior), it becomes clear that modeling of such systems becomes only viable due to the ability to specify rule schemata. Otherwise one would need to specify defined reaction rules for each potential species, i.e. for each combination of cellular attributes and intracellular protein amounts. This is highly impractical for smaller systems with a finite state space and impossible for systems like the presented one. It shows the importance of rule schemata for supporting multi-level modeling.
The above rule also illustrates the need for binding solutions to variables so that entire solutions can not only be preserved for the reactant species, but can be treated like any other variables and thus also be copied and placed into multiple product species. Unlike the volume of the dividing cell and similar to the single cell division rule in the previous section, the cell's content will not be splitted and distributed among both daughter cells. The contained sub-solution will be entirely copied instead, as (according to the original Tyson model) the total amount of cdc2 protein, i.e. the sum of the amount of species D, M I , and M A , is assumed to be constant in each cell. However, by applying specific functions, it would be also possible to split a solution according certain constraints. An integration of such functionality into ML-Rules is planned for the near future.
Mating type switching ensures that - in the long run - both types are equally present in a population of cells. An initial population consisting of only one type of cells nicely shows distinct time points of the first appearance of cells that comprise other combinations of mating type and the ability to switch (Figure 6B). Also the calibration to equally distributed cell types after just a few division cycles can be observed.
Besides the restriction to cells of opposite mating types, conjugation of fission yeast cells is also regulated by diffusible pheromone molecules . When growing in a nitrogen-poor environment, cells are starving and begin to synthesize mating type specific pheromones that are secreted to the extracellular medium. The pheromone secreted by cells of mating type P is called P-factor and M-type cells produce the M-factor pheromone. Fission yeast cells of different mating type are able to communicate with each other via pheromone molecules. Sensing of pheromones released by the opposite type causes several regulation processes that prepare the cells for mating. One of the main effects is an arrest of the division cycle at the G1 phase . We would like to extend our cell model by adding communication via pheromone molecules and the respective responses so that a G1 arrest can be observed.
M-factor molecules may then influence dynamics of P-type cells in the same solution. Conversely, cells of mating type P may communicate with M-type cells via P-factor molecules (F P ). In addition to the M-factor, cells of type M produce and release a P-factor-specific protease (Sxa2) which lowers the effect that P-type cells have on M cells [59, 61].
In fact the above rule includes two different downward causalities at the same time. The first one is the amount of extracellular pheromone, which is included in the rate factor describing a Hill type sigmoidal response curve for MPF repression. The second downward causation is a volume-dependence again. This reflects the observation that inhibition of MPF activity is partly lost due to increasing cell size , which could, for instance, be a consequence from a dilution of involved (but here not regarded) enzymes.
The single-cell simulation experiments given in Figure 7 show how the added reaction rules influence the intracellular processes and by that have an effect on the dynamics at cell level, i.e. progression through the cell cycle phases. As pheromone secretion and mating takes place when nutrition is poor, the mass-doubling time T d has been increased to 232 minutes. Without pheromone, the cell cyle length then increases to roughly 200 minutes (Figure 7B). Similar dynamics can be observed with a low amount of extracellular pheromone (Figure 7C). The amount of repressed MPF molecules is not enough to have a significant effect on MPF activation. This is different with a higher pheromone concentration. Figure 7D indicates a strong suppression of inactive MPF by the repressed variant. With increasing cell size (not shown), repression gets partly lost, i.e. the cell adapts while it grows and completes the cell cycle finally after more than 600 minutes. Please notice the dramatically increased duration of the G1 phase while S/G2 and M phase take only slightly more time than without pheromone sensing.
The model so far assumes all cells as well as each secreted pheromone molecule residing in the same solution, i.e. there is no distinction between different locations. This assumption might be appropriate in many cases. However, especially when it comes to modeling of multicellular systems comprising of communication either via direct cell-to-cell interactions or via diffusible molecules, capturing different species locations might be important . Therefore, to investigate cell division and pheromone signaling in an inhomogeneous solution, we extend the model by some simple spatial dynamics covering pheromone diffusion and different locations of cells.
We adopt the idea of the Next Subvolume Method  to add space in a discretized manner. Rules and reaction rates are responsible to describe reactions between molecules and their diffusion into another voxel in the spatial grid. A new attributed species G is introduced, which represents virtual reaction compartments within a two-dimensional grid. Each voxel G may comprise a solution of cells and pheromone molecules with a homogeneous distribution like before, but species may migrate to adjacent voxels according to certain rules (see Figure 4 for a schematic description of the spatial setting).
with nb(x 1 , y 1 , x 2 , y 2) = if (x 1 = x 2 ∧ (y 1 = y 2 + 1 ∨ y1 = y 2 - 1)) ∨ (y 1 = y 2 ∧ (x 1 = x 2 + 1 ∨ x 1 = x 2 - 1)) then true else false.
An elegant alternative to the above strategy would be the already discussed concept of applying functions to solutions, so that the amount of cells in a given solution can simply be counted. In any case, the examples show how spatial effects like diffusion and excluded volume can be modeled in an ad hoc way, although ML-Rules has been developed without explicit notions of space. Approaches which are aimed at spatial rule-based modeling explicitly deal with such problems and emphasize the need for describing spatial phenomena for larger entities [64, 65].
Simulation of population growth within the described spatial setting reveals that the overall ratio between the different mating types remains nearly constant over time. However, local differences in the amount of P and M type cells can be observed (see additional file 2).
The aim of this paper has been to identify essential concepts for rule-based multi-level modeling, to present their realization in ML-Rules, and to show their role based on a case study. ML-Rules could built on ideas developed in a rule-based approach for multi-level modeling of ecological systems  where attributes, components of specific types, and interfaces are assigned to individual entities. The approach combines hierarchical nesting and the description of constrained dynamics in terms of rule schemata. ML-Rules shares also central features with the rule-based formalism React(C), which supports molecules with attribute values of any type and reaction constraints to flexibly define reaction rates . Being of arbitrary type, attributes can also encode solutions and thus hierarchically nested entities. However, React(C) has no notion of nesting: rules cannot be applied to a solution nested within an entity. Instead, rules can only be applied to an entity attributed with a specific solution, i.e. top-down. Considering that nested hierarchies may be dynamically changed, e.g. in models describing vesicles that fuse with membranes, this limits the expressive power of React(C) with respect to multi-level modeling.
Recently, Oury and Plotkin have presented a stochastic multi-level multiset rewriting language , in which rules can be applied to nested species to support multi-level modeling in systems biology. However, downward and upward causation cannot easily be expressed, because attributes and corresponding constraints on reactions are not yet supported. However, this is announced among the next steps to do.
As has been already discussed, ML-Rules does not provide an explicit notion of linkage, e.g. to describe bonds within protein complexes. Hierarchical graphs with multiple edge types offer a natural and explicit representation of such bindings within hierarchical model structures. In this context, a generalized graph isomorphism and labeling algorithm like HNauty  may be of particular importance. Although originally developed for the structured annotation of flat rule-based models, hierarchical graphs and the HNauty algorithm are promising techniques for the development of an efficient multi-level approach based on graph-rewriting rules.
Spatial structuring of models shares general concepts with multi-level modeling as different levels are defined by a separation from each other in a broader sense. In systems biology, the most common representation of space is realized by simple compartmentalization, i.e. by separating different chemical solutions from each other and allowing for basic transport rules to change the location of molecules, see e.g. BIOCHAM  and little b . More sophisticated capabilities for membrane-mediated transport and interaction rules are supported by cBNGL , an extension of the original BioNetGen language . Structures and rules in cBNGL are tightly coupled with the concept of compartments and membranes, e.g. the language distinguishes between three-dimensional (compartment volume) and two-dimensional (surface, i.e. membrane) compartments.
Other approaches focus on supporting dynamic compartment structures. BioAmbients , for example, which is based on the π-calculus , supports wrapping of processes by so called ambients. Both, processes and ambients, are allowed to enter or exit other ambients and two ambients are allowed to merge into a single one. Similarly, the bioκ-calculus  also allows to fuse multiple membranes resulting in a single compartment. It aims at combining rule-based modeling with dynamic membrane formalisms like the Brane Calculi  and P systems . However, although rooted in the rule-based domain, bioκ shows limited expressiveness for modeling biochemical systems compared to other rule-based languages, e.g. the κ-calculus . Another rule-based formalism with explicit means for dynamic nested model structures is the Calculus of Wrapped Compartments [76, 77]. None of these approaches equips compartments with a state and a behavior of their own; dynamics at the level of compartments are initiated by the enclosed processes or rules in a "bottom up" way.
Bigraphs  is different from the above formalisms as there is no distinction between structural and behavioral elements of a model and thus the approach pursued is rather similar to ML-Rules. Each node of a bigraph may be enclosed by another node and may contain further nodes itself. So, nesting is an inherent property of the bigraphical components. Equipped with a stochastic semantics , reactive bigraphs have been successfully applied for modeling cell biological systems. However, the state of a node is defined by its linkage to other nodes only. Also the lack of constraints on reactions limits the modeling of inter-level causalities.
Beyond ordinary compartment-like spatial approaches (to which we would also count ML-Rules, although its expressiveness widens the applicability for describing other spatial relationships as well), diverse methods have been developed for modeling and simulation of more complex spatial phenomena. Smoldyn, for example, supports simulation of spatial compartments with diffusing molecules, membrane interactions, and excluded volume effects . The latter is also an important feature of ML-Space, a modeling and simulation approach that supports hierarchical nesting and combines population-based reaction-diffusion systems with individual particles for representing different spatial resolutions . Meredys is another simulator that supports reaction-diffusion events taking place in multiple compartments. However, the main feature of Meredys is that molecules and molecular complexes may have realistic shapes in two and three dimensions .
Multi-level models that comprise a wide range of spatial scales, e.g. from the molecular to tissue or even organ scale, often need to consider different spatial relationships at different levels. For example, while at the molecular level well-stirred compartments or heterogenously distributed reaction-diffusion systems are appropriate representations, modeling the dynamics at tissue level might need to take physical mechanics of interacting cells into account. The strong diversity in applied methods is one reason why such multi-scale models typically lack a unifying formal modeling language. Instead, different model parts are described and interpreted differently. To integrate these different parts efficiently, either monolithic mixtures of model descriptions and simulators are programmed from scratch or specialized multi-scale software platforms are used that have been developed for certain applications, see e.g. [82–86].
MGS provides integration of explicit descriptions of space in a generic uniform setting [87, 88]. The approach combines rule-based modeling with topological collections to specify which and how model entities may interact with each other. Various topological collections define different local relationships of individual entities, e.g. ordinary multisets or Delaunay triangulation. Although MGS does not have an inherent notion of nesting, its underlying concepts allow to describe multi-level models in a versatile manner and across various spatial scales.
The need to describe systems at different levels has also been addressed by Petri nets approaches, see e.g. [89–91]. For instance, HORNETS (Higher Order Reference Nets) is a formalism that allows to have Petri nets as tokens of Petri nets . However, HORNETS are executed in equidistant time steps as they are aimed at modeling software systems, e.g. workflows, rather than biochemical systems.
Rule-based languages are a suitable starting point for developing a concise and compact language for multi-level modeling of cell biological systems. Therefore, a combination of concepts, part of which are already well established, can be exploited.
Rule schemata help reducing the size of models and equally important, add the required flexibility to express dynamics at different levels in a general manner. Nesting species, assigning attributes to these species, and constraining reactions according to attributes have been identified as further essential ingredients in supporting multi-level modeling. Species are described by attributes and the species they contain. Both of which might constrain rules (due to functions and conditional expressions) or be altered by them. Thereby, the boundaries of levels might be crossed.
How dynamics at different levels can be described in a rule-based approach has been shown with a model of fission yeast to analyze the regulations between cell cycle control, cell division, mating type switching, and cell-cell communication via diffusible pheromone molecules.
The concepts have been realized in ML-Rules which has been implemented in JAMES II. The use of a plug-in-based modeling and simulation framework has allowed a rapid prototyping of a suitable modeling and simulation environment for our experiments. However, the current simulator is a prototype and realizes a purely stochastic discrete event approach. Although JAMES II offers a coarse grained parallel execution - e.g. to speed up multiple simulation runs - already the single run execution of a more complex ML-Rules model, like the presented multicellular fission yeast model, takes a significant amount of time in the current implementation. This is due to the expressiveness of ML-Rules which requires specific effort to keep calculation costs at bay. Thus, the next steps with respect to implementation will be to look into exploiting different variants of the SSA algorithm, e.g. the optimized direct method, speeding up the matching of reactants, and exploring the potentials of an alternative Monte Carlo method as well as hybrid approaches.
From the modeling point of view, the developed concepts shall be put to test in concrete applications that might be difficult to describe with currently available modeling approaches. For example, potential application areas for ML-Rules are various systems where the relation between intracellular and intercellular dynamics play a role, e.g. quorum sensing, tumor growth, and plant root growth. The presented approach appears also suitable for modeling dynamic processes with multiple membrane bound compartments, like endocytosis, active vesicle transport along cytoskeletal filaments, and processes at the Golgi apparatus.
This work has been supported by the Deutsche Forschungsgemeinschaft (DFG) as part of the project DiER MoSiS (Discrete Event Multi-level Modeling and Simulation for Systems Biology) and the research training group dIEM oSiRiS (Integrative Development of Modeling and Simulation Methods for Regenerative Systems). We would like to thank Mathias John for constructive comments during developing this paper.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.