Skip to content

Computer Lab 3: Systems Biology Modeling

Authors:
Elin Nyman
Henrik Podéus Derelöv

Updated by:
Ralph Monte

Last updated: 2026-01-21

Throughout the lab you will encounter different "admonitions", which is a type of text box. A short legend of the different kinds of admonitions used can be found in the box below:

Different admonitions used (click me)

Background and Introduction

Useful information

Guiding for implementation

Here you are tasked to do something

Reflective questions

To begin with, you will install the packages deSolve and FME. You can read more about these packages if you want to.

Installations

The installation of packages in R can sometimes be problematic due to dependencies between different versions of libraries. If you encounter problems, try to understand the error messages.

# Install FME and deSolve
install.packages("deSolve")
install.packages("FME")
# Load packages
library(deSolve)
library(FME)

Contents

  1. Implement and simulate a simple model
  2. Get familiar with the data and problem formulation
  3. Implement the hypotheses
  4. Test the hypotheses
  5. Compute model uncertainties
  6. Model predictions
  7. Assignment

Implement and simulate a simple model

Here we will learn the basics of how to create models, how to simulate models, and how to plot model simulations.

Implement, simulate, and plot a simple model

Here is an example were we implement a model with a single state \(A1\) with an initial amount of 1 (\(A1(0)=1\)). \(A1\) increases over time with a constant rate (\(k1 = 0.5\)). Let the amount of \(A1\) correspond to the measured amount, so that the measurement equation is \(\hat{y}=A1\).

Example code
## Define ordinary differential equations for the model
simplemodel <- function(time, states, parameters) {
  with(
    as.list(c(states, parameters)),
    {
      dA1 <- k1
      return(list(c(dA1)))
    }
  )
}

## Define time steps for the simulation
times <- seq(0, 10, 0.1)

## Define initial conditions (A1(0) = 1)
states <- c(A1 = 1)

## Define parameters
parameters <- c(k1 = 0.5)

## Simulate the model using ode
simA <- ode(
  y = states,
  times = times,
  func = simplemodel,
  parms = parameters
)

## Plot results
plot(
  simA,
  xlab = "time (min)",
  ylab = "A1 (a.u.)",
  col  = "pink",
  lwd  = 2,
  main = "First graph simple model"
)

## Create a function that simulates the model and returns a data frame
simModel <- function(parameters) {
  as.data.frame(
    ode(
      y = states,
      times = times,
      func = simplemodel,
      parms = parameters
    )
  )
}

## Plot results
simB <- simModel(parameters)

plot(
  simB,
  type = "l",
  xlab = "time (min)",
  ylab = "A1 (a.u.)",
  lwd  = 2,
  col  = "red",
  main = "Second graph simple model"
)

Now change the simple model so that \(A1\) instead decreases with the reaction rate \(k1 \times A1\) (using mass-action kinetics). Use the initial condition \(A1(0)=4\).

If you get errors later, it can be helpful to go back here to study and simulate and play around with these simple models.

Reflect over the following questions before continuing.

Checkpoint 1
  • Do the simple models behave the way you expect them to?
  • Do the simulations change if you vary the initial condition \(A1(0)\) and/or the rate constant, \(k1\)?

Get familiar with the data and problem formulation

Here we will read about the biological system and load and plot the data.

The biological system and data

We take on the role of a system biologist working in collaboration with an experimental group. The experimental group is focusing on the cellular response to growth factors, like the epidermal growth factor (EGF), and their role in cancer. The group has performed an experiment were they stimulate a cancer cell line with EGF and use an antibody to measure the tyrosine phosphorylation of the EGF receptor at site 992, which is known to be involved in downstream growth signalling. The data from the experiment show an interesting dynamic behaviour with a fast increase in phosphorylation followed by a decrease down to a steady state level which is higher than before stimulation, see below:

Experimental Data

The experimental group does not know the underlying mechanisms that give rise to this dynamic behaviour. However, they have two different ideas of possible mechanisms, which we will formulate as hypotheses below.

To generalize this problem, we use the following notation:
- A (Activator) EGF
- R (Receptor) EGFR
- S (receptor Substrate)
- p (phosphorylation)

The first idea is that there is a negative feedback transmitted from the receptor to a downstream signalling molecule (S) that is involved in dephosphorylation of the receptor.

Hypothesis 1

The second idea is that a downstream signalling molecule is secreted to the surrounding medium and there catalyses breakdown of EGF, which also would reduce the phosphorylation of EGFR.

Hypothesis 2

The experimental group would like to know if any of these ideas agree with their data, and if so, how to design experiments to find out which is the more likely idea.

To begin with, let us load and plot the experimental data.

Load and plot the experimental data

Now create a data frame that contain the observations (values are given in the code chunk below). Time for measurements, mean values, and standard error of the mean (sem) are included. Note that mean values must have same notation as corresponding state in the model (e.g. Rp, EGFRp, x2). Plot the data and see that your plot correspond to the figure above.

## Create a data frame that contain the observations (time, mean values and sem values of Rp). Note that mean values must have same notation as corresponding state in the model.
yobs <- data.frame(
  time = c(seq(0, 20, 2)),
  Rp = c(
    0.0325, 0.2325, 0.3350, 0.3010, 0.2550, 0.1885, 0.1395,
    0.1750, 0.1415, 0.1435, 0.0950
  ),
  sem = c(
    0.028, 0.035, 0.059, 0.051, 0.035, 0.036, 0.039,
    0.067, 0.048, 0.037, 0.048
  )
)

## Plot the data including sem
plot(
  yobs$time,
  yobs$Rp,
  xlab = "time (min)",
  ylab = "Rp (EGFRp992Y)",
  ylim = c(0, max(yobs$Rp + yobs$sem))
)
lines(
  yobs$time,
  yobs$Rp,
  type = 'p',
  pch = 19,
  cex = 1.5
)
arrows(
  yobs$time,
  (yobs$Rp + yobs$sem),
  yobs$time,
  (yobs$Rp - yobs$sem),
  length = 0.05,
  angle = 90,
  code = 3
)

Implement the hypotheses

Now that we know how to simulate a simple model, let us implement the hypotheses as models, and simulate them.

The first hypothesis

To go from the first idea to formulate a hypothesis, we include what we know from the biological system:

  • A receptor (R) is phosphorylated (Rp) in response to a stimulation of the activator (A)
  • A downstream signalling molecule (S) is activated as an effect of the phosphorylation of the receptor
  • The active downstream signalling molecule (Sp) is involved in the dephosphorylation of the receptor
  • Phosphorylated/active molecules can return to inactive states without stimuli

From this knowledge, we can draw an interaction graph as depicted below

Interaction Graph

Now we can continue to implement the first hypothesis.

Implement, simulate, and plot the first hypothesis

1. Formulate equations based on the interaction graph.

2. Enter the equations for model1 in your R code.

Use the following parameter values:

Initial values Parameter values
R(0) = 1 k1 = 1
Rp(0) = 0 k2 = 0.001
S(0) = 1 kfeed = 100
Sp(0) = 0 k4 = 0.01
A(0) = 1 k5 = 0.01

3. Simulate and plot a simulation of model1 using these parameter values.

For your guidance is an example with a simpler version of the model included below.

Example code

Note that the derivatives for \(S\) and \(Sp\) is not given.

## Define initial conditions
states1 <- c(
  R = 1,
  Rp = 0,
  S = 1,
  Sp = 0,
  A = 1
)

## Define time steps
times <- c(seq(0, 20, 0.1))

## Define ordinary differential equations for the model
model1 <- function(time, states1, parameters1) {
  with(
    as.list(c(states1, parameters1)),
    {
      dR = -k1 * A * R + k2 * Rp
      dRp = k1 * A * R - k2 * Rp
      dS = 0
      dSp = 0
      dA = 0
      return(list(c(dR, dRp, dS, dSp, dA)))
    }
  )
}

## Simulate the model
simModel1 <- function(parameters1) {
  return(
    as.data.frame(
      ode(
        y = states1,
        times = times,
        func = model1,
        parms = parameters1
      )
    )
  )
}

pstart1 <- c(
  k1 = 1, # You need to update this list of parameters according to your hypothesis
  k2 = 1e-3
)

sim1 <- simModel1(pstart1)

plot(
  sim1$time, 
  sim1$Rp, 
  type = "l", 
  xlab="time (min)", 
  ylab = "Rp (fraction)", 
  col = 'violet', 
  lwd = 2
)

For the first model, a simulation of the state Rp should look something like this:

Correct simulation of Rp

Model 1 Simulation

4. Plot the model simulation together with data. Look at the agreement between model simulation and data.

Code to plot the simulation and data
# Plot data
plot(
  yobs$time,
  yobs$Rp,
  xlab = "time (min)",
  ylab = "Rp (fraction)",
  ylim = c(0, max(sim1$Rp))
)
lines(
  yobs$time,
  yobs$Rp,
  type = 'p',
  pch = 19,
  cex = 1.5
)
arrows(
  yobs$time,
  (yobs$Rp + yobs$sem),
  yobs$time,
  (yobs$Rp - yobs$sem),
  length = 0.05,
  angle = 90,
  code = 3
)

# The plot of the simulation
lines(
  sim1$time,
  sim1$Rp,
  col = "violet",
  lwd = 2
)

Now let us move on the the second hypothesis.

The second hypothesis

Again, to go from the idea to formulate a hypothesis, we include what we know from the biological system:

  • A receptor (R) is phosphorylated (Rp) in response to a stimulation with the activator (A)
  • A downstream signalling molecule (S) is activated as an effect of the phosphorylation of the receptor
  • The active downstream signalling molecule (Sp) is catalysing breakdown of the activator (A)
  • Phosphorylated/active molecules can return to inactive states without stimuli

From this knowledge, we can draw an interaction graph as depicted below

Second Hypothesis Interaction Graph

Implement, simulate, and plot the first hypothesis

1. Formulate equations based on the interaction graph.

2. Enter the equations for model2 in your R code. Create a different function for model2 instead of overwriting model1 so you can use both models forward.

Use the following parameter values:

Initial values Parameter values
R(0) = 1 k1 = 1
Rp(0) = 0 k2 = 2
S(0) = 1 kfeed = 0.1
Sp(0) = 0 k4 = 15
A(0) = 1 k5 = 30

3. Simulate and plot a simulation of model2 using these parameter values.

4. Plot the model simulation together with data. Look at the agreement between model simulation and data.

Reflect over the following questions before continuing.

Checkpoint 2
  • Look at your graphs for both the first and the second hypotheses.
  • Are the agreements with data convincing?
  • If not, what are the problems?
  • Can we test quantitatively if the agreements between model simulations and data are good enough?

Test the hypotheses

Here we will learn how to optimize parameter values for the best possible agreement between model simulations and data.

Parameter estimation

To test hypotheses, we need to identify the parameters that give the best possible agreement between model simulations and data. This process is known as parameter estimation. In parameter estimation, an objective or cost function evaluates the agreement between the model simulation and the data, given the current parameter values, \(p\).

Cost function - \(V(p)\):

\[ V(p) \;=\; \sum_{t=1}^{N}\frac{\bigl(y_{t}-\widehat{y}_{t}(p)\bigr)^{2}}{\mathrm{SEM}_{t}^{2}} \]

An optimization algorithm explores the parameter landscape in the search for the global optimum. The method used as standard in the examples here is the L-BFGS-B algorithm (Limited-memory Broyden–Fletcher–Goldfarb–Shanno with Bound constraints). L-BFGS-B is a quasi-Newton method that supports box-constrained optimization, making it well suited for parameter estimation where parameters have lower and upper bounds. It approximates the Hessian matrix using only a limited amount of recent gradient information, which makes it efficient even for problems with many parameters (read more in e.g. Wikipedia if you are interested).

We will use the functions modCost and modFit from the FME library to define our cost function and objective function call.

Prepare parameter estimation

We will use optimization methods to find the best parameter values that fit the experimental data.

## Define a cost function using modCost, use standard error of the mean, sem, as weight
modelCost <- function(p) {
  out <- ode(
    func = model,
    y=states, 
    parms = p, 
    times = yobs$time
  )
  return(modCost(
    model=out, 
    obs=yobs, 
    err='sem'
  ))
}

## Fit the model to the observations, using modFit with the cost function above and a starting guess for the parameters
fit <- modFit(
  f = modelCost, 
  p = pstart,
  method = "L-BFGS-B"
)

Now, we will explore the parameter space for the hypotheses.

Test the first hypothesis

Search for the best parameters for the first model. You can play around with the starting guess of parameter values and the parameter boundaries. Below you find help on how to use the functions in the FME package.

Given code
## Define a cost function using modCost
modelCost1 <- function(p) {
  out <- ode(
    func = model1,
    y = states1,
    parms = p,
    times = yobs$time
  )
  return(modCost(
    model = out,
    obs = yobs,
    err = 'sem'
  ))
}

## Test different start guesses for the parameter values
pstart1 <- c(
  k1 = 1,
  k2 = 1e-3,
  kfeed = 1e2,
  k4 = 0.01,
  k5 = 0.01
)

## Define boundaries for the parameters
plower1 <- c(
  k1 = 1e-4,
  k2 = 1e-4,
  kfeed = 1e-4,
  k4 = 1e-4,
  k5 = 1e-4
)

pupper1 <- c(
  k1 = 1e3,
  k2 = 1e3,
  kfeed = 1e3,
  k4 = 1e3,
  k5 = 1e3
)

## Fit the model (use modFit) to the observations (add control = list(nprint = 1) as argument to show intermediate results of the fitting)
fit1 <- modFit(
  f = modelCost1,
  p = pstart1,
  upper = pupper1,
  lower = plower1,
  method = "L-BFGS-B"
)

## Simulate Model 1 with the best found parameters and plot results
sim1best <- simModel1(fit1$par)

## Plot best results together with data
plot(
  yobs$time,
  yobs$Rp,
  xlab = "time (min)",
  ylab = "Rp (fraction)",
  ylim = c(0, max(yobs$Rp + yobs$sem))
)
lines(
  yobs$time,
  yobs$Rp,
  type = 'p',
  pch = 19,
  cex = 1.5
)
arrows(
  yobs$time,
  (yobs$Rp + yobs$sem),
  yobs$time,
  (yobs$Rp - yobs$sem),
  length = 0.05,
  angle = 90,
  code = 3
)
lines(
  sim1best$time,
  sim1best$Rp,
  col = 'violet',
  lwd = 2
)

## Compute if model is good enough (chi2-test)
cost1 <- modelCost1(fit1$par)
dgf = length(yobs$Rp)
threshold = qchisq(.95, df = dgf)
print(paste("Chi2 threshold:", threshold))
print(paste("Model cost:", cost1$var$SSR))
print(paste("Model cost > Chi2 threshold (rejected?):", cost1$var$SSR > threshold))

Now let us move on to perform the parameter estimation for the second hypothesis.

Test the second hypothesis

Do the same analysis for the second hypothesis.

Reflect over the following questions before continuing.

Checkpoint 3
  • Are the agreements between model simulations and experimental data better after parameter estimation?
  • Can you reject any of the models with a chi2-test?

Compute model uncertainties

We will be using the Markov chain Monte Carlo (MCMC) approach to compute the model uncertainties.

Markov chain Monte Carlo (MCMC)

In MCMC, parameter values are generated in a random walk process, where each step depends on the previous one. Parameters that give a low cost are more likely to be kept. If a tested parameter is kept, this parameter is used to compute the next parameter. Thus MCMC stays for longer in regions of acceptable parameters. All kept parameters are used to approximate the model uncertainty. When results are simulated, each line correspond to a model simulation with a parameter found using the MCMC approach. These lines together approximates the uncertainty of the model. This uncertainty is important to consider when using the model to make predictions of unknown properties/experiments.

Here you will be given an example of how to compute the model uncertainty.

Compute model uncertainty for the first hypothesis

Use the code below as help when you compute and plot model uncertainties.

Given code
## Use MCMC to explore model uncertainty (modMCMC). Include the previously created cost function as  Change niter to explore results of a shorter and/or longer chain.
MCMC1 <- modMCMC(
  f = modelCost1,
  p = fit1$par,
  lower = plower1,
  upper = pupper1,
  niter = 1000
)

## If you want to, plot the results of MCMC runs
## plot(MCMC1, Full = TRUE)
## hist(MCMC1, Full = TRUE, col = "darkblue")

## Filter MCMC results to include only parameter sets with good fit to data.
## Calculate the cost for each parameter set and keep only those below the chi2 threshold.
dgf <- length(yobs$Rp)
threshold <- qchisq(.95, df = dgf)
costs1 <- apply(MCMC1$pars, 1, function(p) modelCost1(p)$var$SSR)
acceptable_idx1 <- which(costs1 < threshold)
acceptable_pars1 <- MCMC1$pars[acceptable_idx1, ]

## Compute uncertainty for the simulations using only acceptable parameter sets
sR1 <- sensRange(
  parInput = acceptable_pars1,
  func = simModel1
)

## Plot a single state with all accepted simulations from the MCMC run
plot(
  sR1, 
  obs = yobs, 
  xlab="time (min)", 
  which = "Rp", 
  ylab = "Rp (fraction)", 
  main = "Uncertainty of the model"
)

## Plot all states with visualized uncertainties (max/min simulations and +/- the computed standard deviation of the simulations)
plot(
  summary(sR1), 
  quant = FALSE
)

Now let us also compute the model uncertainty for the second hypothesis.

Compute model uncertainty for the second hypothesis

Do the same analysis for the second hypothesis.

Reflect over the following questions before continuing.

Checkpoint 4
  • Do you think you have found good approximations of the uncertainties of the models?
  • How can you test if you have found good approximations?

Model predictions

To try to distinguish between the two hypothesis we will now try to find predictions were the two hypotheses differ. If we find an experimental design that one or both of the hypotheses is unable to describe we can than reject at least one of the hypotheses.

Predictions with the first hypothesis

What would happen if we add the same dose of EGF again after 15 minutes? We can perform those simulations to see if the hypotheses differ in their prediction of such an experiment.

Here you can play around with different experimental designs by changing input strengths and/or simulations lengths. Start with the first hypothesis.

Given code
## Simulate new prediction for Model 1 (longer time, add A again after 15 time units)
simModel1prediction <- function(parameters1) {

  ## make a first simulation for 15 minutes
  sim15 <- ode(
    y = states1,
    times = c(seq(0, 15, 0.1)),
    func = model1,
    parms = parameters1
  )

  ## update state values after first simulation
  statesNew <- c(
    R = tail(sim15[, 2], 1),
    Rp = tail(sim15[, 3], 1),
    S = tail(sim15[, 4], 1),
    Sp = tail(sim15[, 5], 1),
    A = 1
  )

  ## make a second simulation after the first for 5 minutes
  sim5 <- ode(
    y = statesNew,
    times = c(seq(0, 5, 0.1)),
    func = model1,
    parms = parameters1
  )

  ## update time in the simulation (add 15 minutes to second simulation)
  sim5[, 1] <- sim5[, 1] + 15

  return(rbind(as.data.frame(sim15), as.data.frame(sim5)))
}

sRpred1 <- sensRange(
  parInput = MCMC1$pars,
  func = simModel1prediction
)
plot(
  sRpred1,
  xlab = "time (min)",
  which = "Rp",
  ylab = "Rp (fraction)",
  main = "Prediction Rp model 1"
)
plot(
  sRpred1,
  xlab = "time (min)",
  which = "A",
  ylab = "Rp (fraction)",
  main = "Prediction A model 1"
)

Now do the same prediction for the second hypothesis.

Predictions with the second hypothesis

Do the same analysis for the second hypothesis.

Reflect over the following questions before continuing.

Checkpoint 5
  • What can you conclude from the predictions?
  • Would you recommend your experimental collaborator do the suggested experiment?

Homework Exercise

Now, it is time for you to do your own analysis with inspiration from the above examples.

Your own hypothesis

Come up with your own explanation to the data. Provide the interaction graph together with your motivation. Perform all steps of the analysis above. This time, provide a graph where data is compared to the model simulations - including the model uncertainty.

You can for example assume that the activated receptor is internalized into the cell with a slow return back to the plasma membrane. Such a model could give the below result.

Data, simulation, and uncertainty

Hand-in:

  • Provide the interaction graph and the corresponding ODE model for your hypotheses (include your assumptions).
  • Provide a graph with model simulations including uncertainty compared with data.
  • Provide the code to reproduce your figure(s).
  • Suggest an experiment that would be useful to test your hypothesis and motivate your answer. Complement your answer with corresponding model-based simulation of the experiment if you want to.
  • Why is it important to take the model uncertainty into account when making model-based predictions?

The deadline is Wednesday 11th of March 2026 at 15:00. Note: late submissions will not be graded. You are then only allowed to hand-in during the resubmission opportunity (resubmission deadline: Friday 27th of March 2026 at 23:30).