# Table 2 Mathematical model for data generation

Reactor model equations:
$\frac{dX(t)}{dt} = \mu(W(t)) \cdot X(t) - D \cdot X(t)$ $\frac{dS(t)}{dt} = -r_{S}(S(t)) \cdot X(t) - D \cdot (S(t) - S_{F})$
$\frac{dP(t)}{dt} = r_{P}(W(t)) \cdot X(t) - D \cdot P(t)$ $\frac{dV(t)}{dt} = F(t)$
$\frac{dW(t)}{dt} = \frac{Z(t) - W(t)}{\beta}$ $\frac{dZ(t)}{dt} = \frac{S(t) - Z(t)}{\beta}$
$F(t) = \frac{V(t)}{S_{F} - S(t)} \cdot \left(r_{S}(S(t)) \cdot X(t) + \frac{S_{set} - S(t)}{\tau_{set}}\right)$ $D = \frac{F(t)}{V(t)}$
Cell model equations:
$\mu = K_{B1} \cdot \frac{W(t)}{K_{s} + W(t)} - K_{B2} \cdot m_{ATP}$ $r_{P} = K_{P1} \cdot \mu + K_{P2}$
$r_{S} = r_{S,max} \cdot \frac{S(t)}{K_{s} + S(t)}$ $W(t) = \int_{0}^{\infty} S(t - \tau) \cdot \frac{\tau}{\beta^{2}} \cdot \exp(-\tau/\beta) \, d\tau$
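The link between the delay integral for $W(t)$ and the two differential equations for $W$ and $Z$ can be sketched as follows. The kernel names $g_{1}$ and $g_{2}$ are introduced here for illustration only, and the integral is written over the lag $\tau$ from $0$ to $\infty$, which is equivalent to integrating the substrate history from $-\infty$ to $t$.

$g_{1}(\tau) = \frac{1}{\beta} \exp(-\tau/\beta)$ $g_{2}(\tau) = \frac{\tau}{\beta^{2}} \exp(-\tau/\beta)$
$Z(t) = \int_{-\infty}^{t} S(s) \cdot g_{1}(t - s) \, ds$ $W(t) = \int_{-\infty}^{t} S(s) \cdot g_{2}(t - s) \, ds$
$g_{1}(0) = \frac{1}{\beta}$, $g_{1}'(\tau) = -\frac{g_{1}(\tau)}{\beta}$ $\Rightarrow$ $\frac{dZ}{dt} = S(t) \cdot g_{1}(0) + \int_{-\infty}^{t} S(s) \cdot g_{1}'(t - s) \, ds = \frac{S(t) - Z(t)}{\beta}$
$g_{2}(0) = 0$, $g_{2}'(\tau) = \frac{g_{1}(\tau) - g_{2}(\tau)}{\beta}$ $\Rightarrow$ $\frac{dW}{dt} = S(t) \cdot g_{2}(0) + \int_{-\infty}^{t} S(s) \cdot g_{2}'(t - s) \, ds = \frac{Z(t) - W(t)}{\beta}$

Both results follow from the Leibniz rule for differentiating under the integral sign. The two first-order equations therefore reproduce the distributed delay without ever evaluating the integral, provided the substrate history before $t = 0$ is consistent with the chosen initial values of $W$ and $Z$.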
Parameters and initial values:

| Symbol | Value | Unit |
| --- | --- | --- |
| $D$ | - | 1/h |
| $F$ | - | l/h |
| $S_{set}$ | 10 | g/l |
| $K_{B1}$ | 0.1184 | 1/h |
| $K_{B2}$ | 4.7376 | g/mol |
| $K_{P1}$ | 0.48 | - |
| $K_{P2}$ | 0.0008 | 1/h |
| $K_{s}$ | 10 | g/l |
| $m_{ATP}$ | 0.0015 | mol/(g·h) |
| $P$ | 0 | mg/l |
| $r_{S,max}$ | 0.19 | 1/h |
| $S$ | 40 | g/l |
| $S_{F}$ | 1260 | g/l |
| $t$ | - | h |
| $V$ | 15 | l |
| $W$ | $W_{0} = S_{0}$ | g/l |
| $X$ | 1 | g/l |
| $Z$ | $Z_{0} = S_{0}$ | g/l |
| $\tau_{set}$ | 1 | h |
| $\beta$ | | h |
| $\mu$ | - | 1/h |
Mathematical model of MUT+ Pichia pastoris expression with a quadratic distributed delay kernel. The model was used to generate six data sets: three contain clean, noise-free data and the other three contain the corresponding data corrupted with white noise. Of the noise-corrupted sets, one was used to train the hybrid model, one for validation, and one for testing. Integration was performed with the MATLAB function ode45, which integrates the differential equations with a Runge-Kutta (4,5) scheme. The resulting state variables, namely the concentrations of biomass, substrate, and product and the reactor volume, together with the feed concentration, were recorded and treated as measured data for the evaluation. Variation between the data sets was obtained by varying the initial values, which were Gaussian distributed with a spread of 5%. Note that model equations (A5) and (A6) are derived from equation (A12) using the linear chain trick [17, 18] and that (A12) is never used for model calculations.
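The data-generation procedure described above can be illustrated with a minimal Python sketch. It is not the original MATLAB code and rests on several assumptions that the table does not settle. The value of $\beta$ is not given, so `beta = 1.0` h is a placeholder. The simulated time span (10 h), the step size, and the noise level (5% relative) are hypothetical. The "5% Gaussian distributed" initial values are read as the nominal value times a Gaussian factor with 5% standard deviation. A fixed-step classical Runge-Kutta scheme of order 4 stands in for the adaptive (4,5) pair of ode45. The dilution term in the biomass balance is taken with a negative sign, as in the substrate and product balances. All function names are illustrative.

```python
import math
import random

# Values from the parameter table; beta is NOT given there (placeholder assumption).
PARAMS = dict(S_set=10.0, K_B1=0.1184, K_B2=4.7376, K_P1=0.48, K_P2=0.0008,
              K_s=10.0, m_ATP=0.0015, r_S_max=0.19, S_F=1260.0, tau_set=1.0,
              beta=1.0)


def rhs(y, p):
    """Right-hand side of the reactor and cell model; y = [X, S, P, V, W, Z]."""
    X, S, Pr, V, W, Z = y
    mu = p["K_B1"] * W / (p["K_s"] + W) - p["K_B2"] * p["m_ATP"]
    r_P = p["K_P1"] * mu + p["K_P2"]
    r_S = p["r_S_max"] * S / (p["K_s"] + S)
    # Feed law implemented as written; no clipping of F is applied.
    F = V / (p["S_F"] - S) * (r_S * X + (p["S_set"] - S) / p["tau_set"])
    D = F / V
    return [
        mu * X - D * X,                # biomass (dilution sign assumed negative)
        -r_S * X - D * (S - p["S_F"]),  # substrate
        r_P * X - D * Pr,              # product
        F,                             # volume
        (Z - W) / p["beta"],           # delayed substrate signal W
        (S - Z) / p["beta"],           # intermediate delay state Z
    ]


def rk4_step(f, y, h, p):
    """One classical fixed-step RK4 step (stand-in for ode45, not equivalent)."""
    k1 = f(y, p)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)], p)
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)], p)
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)], p)
    return [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def simulate(y0, p, t_end=10.0, h=0.01):
    """Integrate from t = 0 to t_end; returns the list of states at each step."""
    y = list(y0)
    traj = [y]
    for _ in range(round(t_end / h)):
        y = rk4_step(rhs, y, h, p)
        traj.append(y)
    return traj


def make_dataset(rng, p, init_sd=0.05, noise_sd=0.05):
    """Return (clean, noisy) trajectories of X, S, P, V for perturbed initial values."""
    X0 = 1.0 * (1.0 + init_sd * rng.gauss(0.0, 1.0))
    S0 = 40.0 * (1.0 + init_sd * rng.gauss(0.0, 1.0))
    V0 = 15.0 * (1.0 + init_sd * rng.gauss(0.0, 1.0))
    y0 = [X0, S0, 0.0, V0, S0, S0]  # P0 = 0 and W0 = Z0 = S0 as in the table
    clean = [row[:4] for row in simulate(y0, p)]
    noisy = [[v * (1.0 + noise_sd * rng.gauss(0.0, 1.0)) for v in row]
             for row in clean]
    return clean, noisy


rng = random.Random(0)  # fixed seed for reproducibility
clean_sets, noisy_sets = zip(*(make_dataset(rng, PARAMS) for _ in range(3)))
train, valid, test = noisy_sets  # one noisy set each for training, validation, testing
```

A useful sanity check on such a sketch: substituting the feed law into the substrate balance gives $\frac{dS}{dt} = \frac{S_{set} - S}{\tau_{set}}$, so the simulated substrate concentration should relax exponentially from its initial value towards $S_{set}$ with time constant $\tau_{set}$. With $S_{0} > S_{set}$ the feed law as written yields a negative $F$ at the start of the run; the sketch does not clip it because the table states no saturation.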