Transcription factors (TFs) need to locate their target sites on the DNA within a time frame shorter than can be achieved by random diffusion alone. The search process is further complicated by the fact that target sites are usually similar to a significant number of other sites (decoys), and by the fact that other molecules are searching for their own target sites simultaneously. To understand transcriptional regulation, it is therefore essential to have a complete mechanistic understanding of how this search process takes place.

Over the last 40 years, both theoretical and experimental research have established that the search mechanism is a combination of three-dimensional diffusion and a one-dimensional random walk, often referred to as the *facilitated diffusion* mechanism [1–6]. Despite considerable progress, and mainly due to technical limitations [7], there is still a significant gap in our understanding of how TFs locate their target sites [8]. One open issue is the way in which the TF performs the one-dimensional random walk: there is still no consensus on whether TF molecules predominantly slide (do not lose contact with the DNA during the one-dimensional random walk) [9–11] or hop (perform small jumps along the DNA during the one-dimensional random walk) [7, 12]. Another example is the disagreement between values for the proportion of time that TF molecules spend on the DNA: analytical computations of an optimal search process [13] differ from values measured experimentally [4].

One way to address these questions is through stochastic simulations of the facilitated diffusion mechanism [14–16]. In [17, 18] we proposed a computational model, GRiP, that allows genome-wide simulation of the facilitated diffusion mechanism. In particular, the CPU time required to simulate 1 *s* of an *E. coli* K-12 cell with lac repressor (lacI) TFs in GRiP ranges between 1 *h* and 4 *h* on a 2×2.26 GHz quad-core Intel Xeon Mac Pro computer; see [17].

Despite the significant speed-up compared to previous tools, it is still not feasible to use the full genomic sequence as a search space. To address a scientific question with GRiP, multiple simulations need to be performed to allow a meaningful statistical analysis of the results; thus, even small improvements in simulation speed can add up to significant time savings. Optimisation of the algorithm or of the implementation can potentially increase the speed of the simulations, but this is limited by the level of detail in the simulated model. In addition, even with significant algorithmic optimisations, simulating eukaryotic systems with more than 100 *Mbp* and 10^{7} TFs remains impractical.

One strategy to increase simulation speed consists of system size reduction, following the logic that the properties of the search process are the same whether one simulates only a subset or the full genomic sequence. However, this requires a few simulation parameters to be adapted to the size of the subsystem (e.g. the number of TF molecules in the subsystem as compared to the full system). This adaptation is required in order to avoid biases in the results: with an inappropriate number of TFs, target sites could be located faster or could be occupied for longer time intervals. The main advantage of this approach is that smaller systems simulate faster, due to the smaller DNA region and, consequently, the lower number of DNA-bound molecules performing the one-dimensional random walk.

Our results indicate that if the diffusion parameters are conserved and the proportion of covered DNA is similar for the original system and the subsystem, then the subsystem captures the dynamic and steady-state behaviour of the original system with negligible error.

In this contribution, we present two adaptation methods (the *copy number method* and the *association rate method*) that keep the simulation results for the subsystem consistent with those of the full system, and we systematically investigate the degree to which the simulation results are affected when reducing the size of the system. The first method, the *copy number method*, is simpler to implement, but is limited with respect to how much the system can be reduced and in terms of accuracy. This is because TF copy numbers are integers: values lower than 1 cannot be represented, and intermediate values need to be rounded to the nearest integer.
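As a minimal sketch of this proportional scaling and its rounding limitation (the function name and the illustrative lacI numbers are our own assumptions, not part of GRiP):

```python
def scaled_copy_number(full_copies: int, full_length_bp: int,
                       sub_length_bp: int) -> int:
    """Scale the TF copy number to a subsystem of the genome.

    The copy number must stay an integer, which limits how far the
    system can be reduced: the scaled value is rounded to the nearest
    integer, and values below 1 cannot be represented at all.
    """
    exact = full_copies * sub_length_bp / full_length_bp
    return round(exact)

# Illustrative example: 10 lacI molecules on a 4.6 Mbp E. coli genome.
full_genome = 4_600_000  # bp
print(scaled_copy_number(10, full_genome, 460_000))  # 10% subsystem -> 1
print(scaled_copy_number(10, full_genome, 100_000))  # ~2% subsystem -> 0 (!)
```

For a low-abundance TF, reducing the system below roughly one tenth of the genome already rounds the copy number to zero, which is exactly the limitation described above.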

The second approach, the *association rate method*, is slightly more difficult to implement (it requires measuring the proportion of time the molecules spend on the DNA *a priori*), but overcomes the limitations of the previous method: it achieves higher accuracy, and the size of the smallest subsystem is no longer limited by the TF copy number.
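One simple way such an adaptation could be realised, assuming a two-state bound/unbound model in which the fraction of time a molecule spends on the DNA equals k_on / (k_on + k_off), is sketched below; the function name and the numeric values are hypothetical, and the actual method may use a different adjustment:

```python
def association_rate_for_occupancy(k_off: float, f_target: float) -> float:
    """Association rate k_on that reproduces a target bound-time
    fraction f_target = k_on / (k_on + k_off) in a two-state model.

    f_target is the proportion of time a TF spends on the DNA, which
    the association rate method requires to be measured a priori.
    """
    if not 0.0 < f_target < 1.0:
        raise ValueError("f_target must lie strictly between 0 and 1")
    return k_off * f_target / (1.0 - f_target)

# Hypothetical example: a TF measured to spend 90% of its time on the
# DNA, with a dissociation rate of 2 events per second.
k_on = association_rate_for_occupancy(k_off=2.0, f_target=0.9)
print(k_on)  # approximately 18
```

Because the association rate is a continuous quantity, it can be tuned for arbitrarily small subsystems, avoiding the integer-rounding problem of the copy number method.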

Overall, we show that the copy number method performs well for high-abundance TFs, while for low-abundance TFs one needs to rely on the association rate method.