Lab: AI and hybrid modelling

The purpose of this lab is to get an overview of different types of models and usages of AI models in health care and biology. You will also learn how different kinds of models can be combined in hybrid model schemes.

Updated version 2025

The lab has been updated for this year. If you find some oversights or something that is not working correctly, please reach out to the teachers.

Throughout the lab you will encounter different "admonitions", which are styled text boxes. A short legend of the different kinds of admonitions used can be found in the box below:

Different admonitions used

Background and Introduction

Useful information

Guiding for implementation

Here you are tasked to do something

Reflective questions

Lab setup

Check the instructions below and decide if you are running the lab on a local python setup on your computer, in the computer hall, or through a cloud service.

If you haven't done this already, follow the general Get started instructions before you start with this lab. In short, you need:

  1. Python installation running on your computer (or a computer in the computer hall)
  2. A C-compiler installed
  3. Python packages (which is done in the package installation step below).
  4. A text editor or IDE to write code in. If you have no preference, we recommend using VS Codium or VS Code.
  5. Downloaded and extracted the scripts for Lab 4A: Lab4A files

Package installation
To install the packages required for this lab we recommend using uv as your package manager, see installation instructions.

You need to be located in the same folder as the pyproject.toml file that was included in the downloaded files.

uv sync

You need to be located in the same folder as the requirements.txt file that was included in the downloaded files.

pip install -r requirements.txt

If you are using a computer in the computer hall, Python and a valid C-compiler are already installed. You only need to download and extract the scripts for Lab 4A: Lab4A files

Navigate to the folder with the downloaded scripts in the terminal, and then create a new virtual environment and install the required packages:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Now, open the project in your preferred text editor or IDE. If you have no preference, we recommend using VS Code, which should be available in the computer hall.

The lab is provided both as a Python variant and a notebook variant. Please note that the notebook depends on the given file structure, as it imports some functions from the sub-directories.

For more details on the toolbox we use to simulate the models (SUND), follow the link for the documentation.

Lab Overview: The Hybrid Modeling Pipeline

This lab demonstrates how different types of models can work together in a hybrid modeling approach to solve complex biomedical problems. You'll follow a complete pipeline that starts with raw medical imaging data and progresses through multiple modeling stages to ultimately predict disease risk and simulate interventions.

The lab is built on a sequential workflow that includes:

  1. Automated Image Processing - Use deep learning to extract physiological data from medical images
  2. Data Imputation - Handle missing patient data using machine learning techniques
  3. Disease Risk Calculation - Apply different modeling approaches to assess stroke risk
  4. Scenario Simulation - Use mechanistic models to predict long-term health outcomes

Throughout this journey, you'll see how mechanistic models (based on biological principles) and machine learning models (data-driven) can complement each other to create more powerful and comprehensive solutions than either approach alone.

How to pass

At the bottom of this page you will find a collapsible box with questions. To pass this lab, you should provide satisfying answers to these questions in your report. The questions can be answered in a point-by-point format, but each answer should be comprehensive, with adequate motivation and explanation where necessary. Please include figures in your answers where applicable.

Automated image processing

Models usually need a lot of data to be useful, especially in different kinds of image analysis - a common application of AI. Imaging data usually has to be processed in several ways before it can be used, and this preprocessing is often time-consuming to do by hand.

In this part of the lab, you will process training data for a mathematical model using a deep learning network. More specifically, you will use the deep learning network to automatically segment the heart chambers in 4D flow MRI images, and then use this segmented data to train a cardiovascular mechanistic model describing the left ventricle.

Task 1: Run the DL network to get a segmentation of the left ventricular volume
  • Run the network on the data in the folder Automized_image_processing/DLnetwork/data/input/
  • Plot and analyze the resulting segmentations
The deep learning network

Segmenting the whole heart over the cardiac cycle in 4D flow MRI is a challenging and time-consuming process, as there is considerable motion and limited contrast between blood and tissue. To speed up this process, this deep learning-based segmentation method was developed to automatically segment the cardiac chambers and great thoracic vessels from 4D flow MR images. A 3D neural network based on the U-net architecture was trained to segment the four cardiac chambers, aorta, and pulmonary artery. Since deep neural networks learn most of their structure from the data, without using any a priori knowledge, they can be called machine learning models.

For more information, see the paper by Bustamante et al: Automatic Time-Resolved Cardiovascular Segmentation of 4D Flow MRI Using Deep Learning.

Run the network and plot the results

To run the network, open the Automized_image_processing/DLnetwork folder in your editor/IDE. Read the README-file to get a better understanding of all the files.

Run main.py. The main script assembles and executes the necessary commands for you, depending on the flags you call it with. By default, all methods for initializing, building, calling, analysing the results, and plotting are run - the code for these methods is located in the scripts folder.

Note: the network could take around 10 minutes to run on a normal laptop. You can see the progress in the terminal window.

If you cannot get the network running, ask the lab supervisor or another student to get the resulting segmentations to be able to continue with the lab.

Look at the segmentation results.
  • Do you think the segmentation is correct? Why?
  • Does the left ventricular volume seem reasonable? Why?
Task 2: Run the cardiovascular model to simulate the left ventricular volume during a heart beat

Now navigate to the Cardiovascular_model/ located on the same level as the DLnetwork folder (within the Automized_image_processing folder).

  • Load the analyzed left ventricular volume data from Task 1.
  • Do a model simulation
  • Plot the model simulation of the ventricular volume together with the deep learning-based ventricular volume
The cardiovascular model

Figure1

Figure 1: The cardiovascular model including the left ventricle.

The cardiovascular model here is an expanded version of the model you used in the model formulation lab. Instead of a sine wave-based input to the aorta, a left ventricle is added, which contracts and relaxes to create a more realistic outflow to the aorta. This cardiovascular model is formulated as an ODE system instead of a DAE system.
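To illustrate what an ODE formulation of a circulation component looks like, here is a minimal two-element Windkessel sketch. This is not the lab's model (the actual model is simulated through the SUND toolbox), and all parameter values and the sinusoidal inflow are invented for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative two-element Windkessel ODE: C * dP/dt = Q_in(t) - P/R.
# All values are made up - this is NOT the lab's cardiovascular model.
R, C = 1.0, 1.5  # peripheral resistance and arterial compliance (arbitrary units)

def q_in(t):
    # pulsatile inflow: positive half of a sine wave, 0.8 s heart cycle
    return max(np.sin(2 * np.pi * t / 0.8), 0.0)

def dPdt(t, P):
    return [(q_in(t) - P[0] / R) / C]

sol = solve_ivp(dPdt, (0.0, 8.0), [1.0], max_step=0.01)
print(f"Simulated pressure range: {sol.y[0].min():.2f} to {sol.y[0].max():.2f}")
```

The pressure decays when inflow is zero and rises during inflow, giving the characteristic pulsatile waveform that the full model produces in a more physiological way.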

Load the data

You will need to add code to load the analyzed left ventricular volume data from the DLnetwork, by default located in Automized_image_processing/DLnetwork/data/analyzed/. The .npz file type is loaded with numpy:

import numpy as np

# Path to your LV volume data NPZ file
data_path = "Automized_image_processing/DLnetwork/data/analyzed/LV_volume.npz"

# Load the data
npz_data = np.load(data_path)

We are then interested in the lv_volume_aligned and time_points_aligned fields, which we will pass on to the plot function.

# Extract the arrays
lv_volume = npz_data['lv_volume_aligned']
time_points = npz_data['time_points_aligned']

# Close the npz file
npz_data.close()

We will now use the data for the left ventricular volume to compare with the model simulation. This has been prepared for you: simply run the Automized_image_processing/Cardiovascular_model/cardiovascular_model.py file. Note that you are expected to navigate (change path) to this folder first.
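If you want to build a comparison plot like this yourself, a minimal matplotlib sketch could look like the following. The arrays here are dummy stand-ins, and the variable names (time_points, lv_volume, sim_time, sim_volume) are assumptions, not the lab's actual output.

```python
import numpy as np
import matplotlib.pyplot as plt

# Dummy stand-ins for the segmentation data and the model simulation;
# in the lab these would come from the .npz file and the SUND simulation.
time_points = np.linspace(0.0, 0.8, 40)
lv_volume = 120 - 50 * np.sin(np.pi * time_points / 0.8)
sim_time = time_points
sim_volume = 118 - 48 * np.sin(np.pi * time_points / 0.8)

plt.plot(sim_time, sim_volume, label="model simulation")
plt.plot(time_points, lv_volume, "o", label="DL-based segmentation")
plt.xlabel("Time (s)")
plt.ylabel("LV volume (mL)")
plt.legend()
plt.savefig("lv_volume_comparison.png")
```

Plotting data as points and the simulation as a line makes it easy to see where the model fails to describe the data.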

Look at the plot of the model simulation and the data.
  • Can the model describe the data?
  • What is the aortic pressure? Is it reasonable? Why/why not?
Task 3: Fit the cardiovascular model to the segmentation-based left ventricular volume
  • Load the analyzed left ventricular volume data from Task 1.
  • Optimize the model parameters to fit the left ventricular volume to data.
  • Plot the resulting simulations of left ventricular volume compared to data, and the predicted aortic pressure.
  • Note the diastolic and systolic pressure - these results will be used in the next task.

Open parameter_estimation.py in the folder Cardiovascular_model and add code to load the left ventricular data as in Task 2. The optimizer differential_evolution has been prepared as the default optimization solver, but you are free to choose your preferred optimizer from the Parameter estimation lab. When everything is set up, run the parameter_estimation.py script to fit the cardiovascular model to the left ventricular volume data.

Efficient parameter estimation

To make the parameter estimation more efficient, the default parameter bounds are set to be fairly narrow. If you would like to make the parameter estimation even faster, you can:

  • Only estimate the values of the model parameters governing the contraction and relaxation of the left ventricle (Emax_LV,Emin_LV,k_diast_LV,k_syst_LV,m2_LV,m1_LV,onset_LV).
  • Reduce the allowed number of iterations (maxiter) in the optimization solver settings to make sure that it does not take too long.

Note that these suggestions are optional.
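As a self-contained illustration of the differential_evolution workflow, here is a toy fit of a sine wave's amplitude and phase to noisy data. The actual cost function, parameter bounds, and data come from the lab files; everything below is invented for illustration.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Toy example: fit amplitude and phase of a sine to noisy "data". In the lab,
# the cost function would instead simulate the cardiovascular model and compare
# the simulated LV volume to the segmentation-based volume.
rng = np.random.default_rng(1)
t = np.linspace(0, 0.8, 50)
data = 1.8 * np.sin(2 * np.pi * t / 0.8 + 0.3) + rng.normal(0, 0.05, t.size)

def cost(params):
    amp, phase = params
    return np.sum((amp * np.sin(2 * np.pi * t / 0.8 + phase) - data) ** 2)

res = differential_evolution(cost, bounds=[(0, 5), (-np.pi, np.pi)],
                             maxiter=100, seed=1)
print(res.x)  # should be close to the true values (1.8, 0.3)
```

The bounds play the same role as the narrowed parameter bounds mentioned above: a smaller search space means fewer iterations to find a good fit.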

Look at the plot of the model simulation and the data.
  • Can the model describe the data?
  • What is the aortic pressure? Has it changed? Is it reasonable?
  • Do you think that the data of the left ventricular volume is enough to reliably predict the aortic pressure?
  • Do you think that the model complexity is good for this amount of data, or could it be smaller or bigger? Look at Figure 1 and discuss.

From Individual Patient Data to Population Analysis

The previous section demonstrated how deep learning can automatically extract physiological measurements (like left ventricular volume) from medical images, and how mechanistic models can interpret this data to predict clinically relevant parameters (like blood pressure).

However, in real healthcare applications, we rarely work with just one patient's complete dataset. Instead, we typically have incomplete data across many patients - some patients may have imaging data, others may have blood pressure measurements, but rarely do we have complete information for everyone.

This brings us to the next challenge in our hybrid modeling pipeline: how do we handle missing data and make predictions for patients with incomplete information? This is where machine learning approaches for data imputation become essential.

Impute data

Missing data is commonplace in health care. Data could be missing due to cost (some measurement techniques are more expensive than others), measurement errors, lack of compliance, etc. When doing statistical analysis on data with a lot of missing data, the results can be biased or less representative. One way of handling the problem of missing data is to do an imputation, i.e. replacing missing data with substitute values.

In this part, you will impute the missing data in database.csv using k-nearest neighbors (kNN).

Task 4: Impute patient data using kNN
  • Add the SBP and DBP values from Task 3.
  • Try different k values for the KNN process.

Firstly, navigate to the Impute_data folder. In the given dataset (database.csv), the systolic and diastolic pressure for the patient on row 1 are missing. We will use the values obtained from Task 3 for this row, and row 1 will from now on be referred to as Patient 1. The main_impute.py script is prepared for you; add your SBP and DBP values as variables, which then update the csv file.

k-nearest neighbors (kNN)

kNN is an algorithm that assigns a class or value to new data points based on their k nearest neighbors. The kNN model has one parameter and one equation that have to be decided. The parameter is k - how many nearest neighbors the classification will be based on. A small value of k means that noise in the data has a higher influence on the result. A large value of k makes the computation more expensive and the prediction less precise, since you include neighbors that are more dissimilar. The choice of k therefore ultimately depends on your data set. Parameters like this, which govern how the model learns, are called hyperparameters, and can also be optimized. The equation that has to be chosen is the distance metric - how the distance to each neighbor is calculated. Two common distance metrics for continuous variables are the Euclidean and the Mahalanobis distance. Both generalize the Pythagorean theorem to compute a distance, but the Mahalanobis distance also accounts for correlation between variables, making it more suitable for datasets with more than two variables. Herein we will therefore use the Mahalanobis distance.

Figure2

Figure 2: Example of kNN classification. The test sample (green dot) should be classified either to blue squares or to red triangles. If k = 3 (solid line circle) it is assigned to the red triangles because there are 2 triangles and only 1 square inside the inner circle. If k = 5 (dashed line circle) it is assigned to the blue squares (3 squares vs. 2 triangles inside the outer circle).
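To see the mechanics of kNN imputation on a toy example, here is a sketch using scikit-learn's KNNImputer. Note two assumptions: the data values are invented, and KNNImputer uses a (nan-)Euclidean distance rather than the Mahalanobis distance used in the lab script.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy dataset with columns (age, SBP, DBP); row 2 has a missing SBP.
X = np.array([
    [25, 120.0, 80.0],
    [27, 118.0, 78.0],
    [26, np.nan, 79.0],   # missing SBP to impute
    [60, 150.0, 95.0],
])

# Impute using the mean of the k=2 nearest complete neighbors.
imputer = KNNImputer(n_neighbors=2)
X_imputed = imputer.fit_transform(X)
print(X_imputed[2, 1])  # → 119.0, the mean SBP of the two nearest rows
```

Here the two young patients are closest to row 2, so the imputed SBP is their average; the dissimilar older patient (row 3) is ignored. Increasing k to 3 would pull the imputed value towards that patient's higher SBP.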

You should also set the value of the variable k (see the description of the parameter k above), and add a code segment that saves the results after the imputation.

Save the results

The csv file format is fairly easy to browse to get an overview of large data sets. You can save Python data as a csv file using the pandas package.

import pandas as pd

# Save complete imputed dataset using pandas.DataFrame() and to_csv()
df_imputed = pd.DataFrame(imputed_data, columns=header)
df_imputed.to_csv(f"./Impute_data/database_imputed_k_{k}.csv", index=False)

# Save patient 1 as CSV
patient_1_df = pd.DataFrame([imputed_data[0, :]], columns=header)
patient_1_df.to_csv(f"./Impute_data/patient_1_k_{k}.csv", index=False)

f-strings (f"") allow us to embed variables in strings, and are useful when we want to include a variable in the name of a file.

Now you can run the imputation. Compare the resulting imputed data for some different values of k, and choose one that you think gives reasonable results. You can read about the different variables in the database below.

Overview of the database

A prospective cohort of mostly diabetics and prediabetics.

Variables Description
AGE: in years
SEX: coded as 1 = male 2 = female
BMI: (kg/m^2) body mass index, calculated as weight/height^2
<18.5 – underweight
18.5 - 24.9 – normal weight
25 - 29.9 – overweight
30 - 34.9 – obese
> 35 – extremely obese
CPD: average cigarettes smoked per day
DBP: diastolic blood pressure in mmHg. The diastolic blood pressure is the lowest pressure during the heart cycle, measured when the heart is refilling with blood.
60 - 80 – normal or elevated
80 - 89 – high, stage 1
> 90 – high, stage 2
> 180 – hypertensive crisis

SBP: systolic blood pressure in mmHg. The systolic blood pressure is the highest pressure during the heart cycle, measured when the heart is contracting.
90 - 120 – normal
120 - 129 – elevated
130 - 139 – high, stage 1
> 140 – high, stage 2
> 180 – hypertensive crisis
DMRX: diabetes diagnosis, coded as 0 = no diabetes 1 = diabetes
AF_beforestroke: atrial fibrillation diagnosed before stroke, coded as 0 = no atrial fibrillation and 1 = atrial fibrillation
STROKE: stroke, coded as 0 = no stroke, 1 = stroke

Look at the imputed data
  • How could k be chosen in a better way? What would you need to do that?
  • Can you think of any other type of model you could use to impute this data?
  • Compare patient 1 with all other patients - can you see any problem with using this specific dataset to impute data for patient 1?

From Complete Data to Clinical Decision Making

Now that we have addressed the missing data problem using machine learning-based imputation, we have a more complete patient profile. Our hybrid modeling approach has taken us from raw medical images (processed with deep learning) to complete patient data (using kNN imputation).

The next critical step in healthcare applications is translating this data into actionable clinical insights. This means moving from "what are the patient's measurements?" to "what is their risk of developing disease?"

In this section, we'll explore how different modeling approaches can be used to estimate disease risk, comparing: - Simple statistical approaches (looking at population averages) - Clinical risk scores (established medical guidelines)
- Advanced machine learning models (ensemble methods)

This comparison will highlight a key aspect of hybrid modeling: different models may be optimal for different purposes, and understanding their strengths and limitations is crucial for clinical decision-making.

Calculate disease risk

Estimating disease risks is an integral part of preventative health care. An estimated risk is affected by which risk factors you include, and by which algorithm, if any, you use to calculate it.

Task 5: Calculate the current risk of getting a stroke within 5 years for patient 1

Try to calculate the stroke risk for patient 1, using the data you just imputed, in 3 different ways:

  • i) looking at the imputed value of stroke risk for patient 1 from Task 4.
  • ii) using the Framingham risk score.
  • iii) using an ensemble risk model.
Framingham risk score

You can calculate the risk score using this web based calculator: Framingham Risk Score for Hard Coronary Heart Disease. Note that you don't have all the data that the calculator asks for.

Below you can read about ensemble models.

Ensemble model

Ensemble models use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms on their own. This specific ensemble model is a combination of logistic regression models for 4 different age groups: under 50, 50-59, 60-69, and over 70. These age-specific models capture the non-proportionality of risk factors by age and are better calibrated to the younger and older populations. By combining the age-specific models into one ensemble model, we get a smooth transition of how risk changes over the years. You can read more about this risk model in Age Specific Models to Capture the Change in Risk Factor Contribution by Age to Short Term Primary Ischemic Stroke Risk and in Digital twins and hybrid modelling for simulation of physiological variables and stroke risk.
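The idea of age-specific logistic models can be illustrated with a small sketch. All intercepts and coefficients below are invented for illustration, the single SBP predictor is a simplification, and the real model (including its blending between neighbouring age groups) lives in stroke_risk_model.py.

```python
import numpy as np

# Invented age-specific logistic models: age range -> (intercept, SBP coefficient).
AGE_GROUPS = {
    (0, 50):   (-6.0, 0.020),
    (50, 60):  (-5.5, 0.022),
    (60, 70):  (-5.0, 0.024),
    (70, 200): (-4.5, 0.026),
}

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def stroke_risk(age, sbp):
    # Evaluate every group model whose age range contains the patient's age.
    # Here each age falls in exactly one group (a piecewise model); the real
    # ensemble blends neighbouring groups for a smooth transition over age.
    risks = [logistic(b0 + b1 * sbp)
             for (lo, hi), (b0, b1) in AGE_GROUPS.items() if lo <= age < hi]
    return float(np.mean(risks))

print(stroke_risk(65, 140))
```

With separate intercepts and slopes per group, the same rise in SBP can contribute differently to risk at different ages, which is exactly the non-proportionality the ensemble is designed to capture.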

Now navigate to the Calculate_disease_risk folder. The ensemble risk model has been prepared for you, specifically in the stroke_risk_model.py file. We will use the imputed data for patient 1 that you saved in Task 4. Open the script run_stroke_model.py, load the patient 1 data, and write a code snippet to save the results to a file. Then run the ensemble risk model by calling the run_stroke_model.py file.

Load the data

You can load data from a .csv (which was the proposed way to save the patient 1 data in Task 4) using the csv.reader() function.

import csv

# load patient 1 data
file_directory = "path/to/your/file.csv"
with open(file_directory, "r") as file:
    patient1_data = list(csv.reader(file))
Save the results

To save the risk score value, you could use the .json file format.

import json

# add code to save the risk scores
# (NumpyArrayEncoder, used to serialize numpy arrays, comes from the lab scripts)
with open("./Calculate_disease_risk/risk_scores_patient_1.json", "w") as f:
    json.dump(risk, f, cls=NumpyArrayEncoder)
Compare the risks of the different models
  • Which of these predictions do you find most useful and trustworthy?
  • What pros and cons do you see with the different prediction models?

Now you should repeat the risk score calculation for a new patient 2.

Task 6: Calculate current risk for Patient 2

Come up with values for an imaginary patient 2 of the same age as patient 1 that you think would give a lower risk of stroke than for patient 1 (you can e.g. copy your patient 1 file and change the name of the file you load in the script). Calculate the risk for patient 2 using the ensemble model, and try out different variable values until patient 2 has a lower risk of stroke than patient 1.

Compare the risks of the different patients
  • What do you think is the biggest contributor to disease risk in patient 1 and patient 2's age group?

From Current Risk to Future Predictions and Interventions

The previous sections have shown us how to assess a patient's current risk using various modeling approaches. However, one of the most powerful applications of hybrid modeling is the ability to predict how risk will change over time and evaluate the potential impact of different interventions.

This is where mechanistic models truly shine in our hybrid approach. While machine learning models excel at pattern recognition in current data (as we saw with image segmentation, imputation, and risk assessment), mechanistic models can:

  1. Simulate biological processes over time - predicting how patient physiology evolves
  2. Model intervention effects - evaluating how treatments, lifestyle changes, or medications might alter disease progression
  3. Provide biological insights - explaining why certain interventions work, not just that they work

In this final section, we'll use mechanistic models to simulate long-term scenarios for our patients, demonstrating how the complete hybrid modeling pipeline enables not just diagnosis, but personalized treatment planning and preventive care.

Simulating scenarios

Managing disease risks involves predicting the evolution of risk factors over time, as well as predicting the individual effects of different interventions, such as changes in lifestyle, diet, drugs, etc. One way of doing such predictions is by using mechanistic models.

Task 7: Simulate scenarios using a multi-level mechanistic model

Simulate three scenarios based on patient 1 and patient 2 with the mechanistic model described below. Simulate for 10 years or longer. Try to come up with scenarios you think would lower the risk for patient 1 and increase the risk for patient 2 over time. Then, at around half of your simulation time, add some intervention to patient 2 so that the risk decreases. You should then have a plot of three scenarios.

Mechanistic model

The mechanistic model is a multi-level, multi-time-scale model, meaning that it consists of several biological levels - in this case the cell, organ/tissue, and whole-body levels - and that it can be simulated on different time scales, from minutes up to years. The model consists of two parts. One part describes processes relevant to glucose regulation - in a fat cell and in relevant organs during a meal, changes in weight due to diet, and subsequent changes in insulin resistance. The model can also develop diabetes. The model takes the change in energy intake (dEI) in kcal as input. A normal daily intake is about 2000-2500 kcal, and the model has been evaluated for a dEI between 250 and 4000 kcal. The other part describes changes in systolic and diastolic blood pressure due to ageing and blood pressure medication. It is not as detailed or as individualized as the cardiovascular model in Tasks 2 and 3, but it can be simulated for longer time periods. You can read more about the model in Digital twins and hybrid modelling for simulation of physiological variables and stroke risk and A multi-scale digital twin for adiposity-driven insulin resistance in humans: diet and drug effects

Figure3

Figure 3: Overview of the diabetes part of mechanistic multi-level model. A) Whole-body level. The body composition model takes change in energy intake as input, i.e., the difference in energy intake (EI) and energy expenditure (EE). This difference translates to the outputs: changes in the masses of fat (F), lean tissue (L), and glycogen (Gly). The total sum of these masses is the body weight (BW). The insulin resistance model (green box) takes the change in fat mass (xF) as input. B) The following factors influence the glucose concentration on the tissue/organ level: the insulin resistance, xF, the change in lean tissue (xL), and BW. More specifically, insulin resistance (green short arrows) increases endogenous glucose production (EGP) and insulin secretion (Ipo), and decreases glucose uptake in both muscle (U_idm) and liver tissue (U_idl). Furthermore, xL increases U_idm, xF increases glucose uptake in fat tissue (U_idf), and BW increases the rate of appearance of glucose (Ra). C) Finally, the amount of insulin in fat tissue translates to insulin input on the cell level. More specifically, insulin binds to the insulin receptor (IR), causing a signalling cascade that ultimately results in glucose transporter 4 (GLUT4) being translocated to the plasma membrane to facilitate glucose transport. The marked reactions on the cell level are the protein expressions of IR and GLUT4 (black arrows going to), the effect of insulin resistance on the protein expression of IR and GLUT4 (green arrows), as well as the degradation of IR and GLUT4 (black arrows going out). These reactions enable a gradual decrease in IR and GLUT4, moving the cell towards diabetes.

We also include a smaller model to describe the systolic and diastolic blood pressure (SBP/DBP) and how they are affected by a drug treatment.

Figure4 Figure 4: Overview of the blood pressure model. SBP is systolic blood pressure, DBP is diastolic blood pressure, and both are affected by both age and by the drug.
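As a rough illustration of the idea in Figure 4 - pressures drifting upward with age and being lowered by medication - here is a toy function. The slopes and drug effects are made-up numbers, not the calibrated model values, and the real model describes this with differential equations rather than a static formula.

```python
# Toy sketch of the blood pressure model's behaviour: SBP and DBP drift
# upward with age and are lowered by medication. All numbers are invented.
def blood_pressure(age, on_medication):
    sbp = 110 + 0.6 * (age - 20)    # assumed ageing drift (mmHg per year)
    dbp = 70 + 0.25 * (age - 20)
    if on_medication:
        sbp -= 10                    # assumed drug effect on SBP
        dbp -= 5                     # assumed drug effect on DBP
    return sbp, dbp

print(blood_pressure(60, False))  # → (134.0, 80.0)
print(blood_pressure(60, True))   # → (124.0, 75.0)
```

Even this crude sketch shows why the timing of an intervention matters: starting medication earlier shifts the whole trajectory, not just the endpoint.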

Navigate to the Simulating_scenarios folder. Here, the main.py file is prepared for you to interact with the mechanistic model. Start by loading the data for patients 1 and 2.

Loading data from csv format

To load data from a .csv file you can use the following code snippet.

import csv

file_name_patient = ""
with open(file_name_patient, "r") as file:
    data = list(csv.reader(file))

To govern the simulation and the interventions, you should change the values in the following variables:

  • t_simulation - the duration of the simulation
  • t_bp_medication - if and when to add blood pressure medication
  • p_dEI - how much, if any, change in energy intake to add

You will then simulate the mechanistic model by running main.py. The script will plot the variables that can be used as input to the risk model. Note that you also need to add code for saving your new time-resolved results.

Saving your results

You will need this data in the next step to do new risk score calculations. You should therefore save the data in the same structure, with the same column/variable names, as the csv file used as input to the risk model in the previous steps. As the database column names do not match the model feature names, we define a mapping between them.

# map the column names in the database to the feature names in the simulation
column_feature_map = {
    "AGE": "Age",
    "SEX": "sex",
    "BMI": "BMI",
    "CPD": "CPD",
    "DBP": "DBP",
    "SBP": "SBP",
    "DMRX": "Diabetes",
    "AF_beforestroke": "AF_beforestroke",
    "HEIGHT": "height",
    "STROKE": "STROKE", 
    }

Thereafter, we can use this mapping to extract the simulation result

# loop over the database header and extract the corresponding values from the simulation
for key, value in column_feature_map.items():
    # add the header keys to the results dictionary
    patient1_results[key] = []
    patient2_results[key] = []

    # add the simulated values to the results dictionary
    if value in sim_patient1.feature_names:
        patient1_results[key] = sim_patient1.feature_values[:, sim_patient1.feature_names.index(value)].tolist()
        patient2_results[key] = sim_patient2.feature_values[:, sim_patient2.feature_names.index(value)].tolist()

However, not all columns are available as features, and in those cases we propagate the given value for the patient over the time vector.

    else:
        # if the key is not in the simulated features, it is a constant value
        if key == "CPD":
            patient1_results[key] = [float(patient1_data[idx_cpd])] * len(sim_patient1.time_vector)
            patient2_results[key] = [float(patient2_data[idx_cpd])] * len(sim_patient2.time_vector)
        elif key == "AF_beforestroke":
            patient1_results[key] = [float(patient1_data[idx_af])] * len(sim_patient1.time_vector)
            patient2_results[key] = [float(patient2_data[idx_af])] * len(sim_patient2.time_vector)
        elif key == "STROKE":
            patient1_results[key] = [float(patient1_data[idx_stroke])] * len(sim_patient1.time_vector)
            patient2_results[key] = [float(patient2_data[idx_stroke])] * len(sim_patient2.time_vector)

Finally, we can save the results to the csv file. We do this by defining the header as the column names and then iterating over the values for each time point. Update the file name and repeat the method for the second patient.

# save the results to a csv file
file_name = "./Simulating_scenarios/result_file_name.csv"
with open(file_name, "w", newline='') as file:
    writer = csv.writer(file)
    writer.writerow(database_header)  # write header
    for i in range(len(sim_patient1.time_vector)):
        row = [patient1_results[key][i] for key in database_header]
        writer.writerow(row)
Compare your simulations for patient 1 and 2
  • Do you think the simulations look reasonable? Why/why not?
  • Could you simulate all the scenarios you wanted to simulate?
Task 8: Calculate an updated continuous risk score using the simulated scenarios

The ensemble model can be used to calculate a continuous risk over time if your input data is continuous. Using your simulated scenarios as input, you can therefore get a prediction of the risk given the scenario you simulated, and compare different scenarios.

Run risk model with continuous data

Load the time continuous data you got from the last step and give it as input to the ensemble risk model. You can return to the files we used during Task 5 and 6 and simply load this new data.
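Conceptually, a continuous risk curve is obtained by evaluating the risk model at every simulated time point. The sketch below illustrates this with a hypothetical risk_model stand-in and invented trajectories; the real ensemble model in stroke_risk_model.py takes the full set of patient variables.

```python
import numpy as np

# Stand-in for the ensemble risk model; the coefficients are made up and the
# real model takes more variables than age and SBP.
def risk_model(age, sbp):
    return 1.0 / (1.0 + np.exp(-(-6.0 + 0.04 * age + 0.02 * sbp)))

# Invented time-resolved simulation output: ageing over 10 years with a
# slowly rising SBP, like the trajectories produced in Task 7.
ages = np.linspace(55, 65, 11)
sbps = 130 + 0.8 * (ages - 55)

# Evaluate the risk at each time point to get a continuous risk curve.
risk_over_time = np.array([risk_model(a, s) for a, s in zip(ages, sbps)])
print(risk_over_time[0], risk_over_time[-1])
```

An intervention halfway through the simulation (e.g. starting blood pressure medication) would change the sbps trajectory from that point on, and the risk curve would bend accordingly.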

Compare continuous risk scores for patient 1 and 2
  • Do you think the risk scores look reasonable? Why/why not?
  • Which variables seems to contribute the most to the risk score as the patients age?
  • What would you like to add to the mechanistic model if you could?
Task 9: Why do glucose levels increase in the mechanistic model when body fat increases?

In your scenarios you might have added an increased energy intake and given the model diabetes. If not, try to simulate such a scenario now. Can you figure out which mechanisms in the model drive the progression to diabetes?

Simulate relevant variables from the mechanistic model to see what changes as the model transitions from non-diabetic to diabetic. You can do this by adding plots for the features in the model file that you are interested in observing. The insulin resistance affects 6 different states and variables in the model; find at least three of these.

Lab Conclusion: The Power of Hybrid Modeling

Congratulations! You have now completed a comprehensive hybrid modeling pipeline that demonstrates the synergistic power of combining different modeling approaches.

What you've accomplished:

  1. Deep Learning → Automated extraction of physiological data from medical images
  2. Machine Learning → Intelligent handling of missing patient data
  3. Statistical/Clinical Models → Assessment of current disease risk
  4. Mechanistic Models → Prediction of long-term health trajectories and intervention effects

This pipeline showcases how no single modeling approach could have achieved the complete solution. Each type of model contributed its unique strengths:

  • Machine learning models excelled at pattern recognition in complex, high-dimensional data
  • Mechanistic models provided biological insight and predictive capability over time
  • Hybrid combinations enabled comprehensive analysis from raw data to clinical decision support

This approach represents the future of computational medicine: leveraging the best of each modeling paradigm to create more powerful, interpretable, and clinically relevant solutions than any single approach could provide.

Questions for lab 4A - Hybrid modelling:

In your lab report, you should answer the following questions:
  • Now that you have tried some different kinds of models, both in this lab and in the others, what disadvantages and advantages as well as usages do you see with mechanistic models?
  • ... machine learning models? (ensemble model, kNN, deep neural network, etc)
  • Show your plot of the continuous risk scores for patient 1 and 2 and describe the scenarios that gave these scores. Describe shortly all the modelling steps that you did to get this plot and reasons for uncertainty in all these steps.
Help with DL network

If you have trouble running the DL network, you can download the solution to continue with Task 2.