Modelling in Python
How can I load the best saved parameter sets?
Assuming you have saved your found parameter sets as a dictionary in a .json file, you can use the following function to automatically load the best parameter values.
import json
import os

def load_best_parameters(folder, param_key='x', cost_key='cost'):
    """Load the best parameters from a folder with json files. The best parameters are the ones with the lowest cost.
    The function assumes that the json files have the following structure:
    {
        "cost": [cost of the parameters],
        "x": [list of parameters]
    }
    Args:
        folder (string): The folder to search for the results json files.
        param_key (str, optional): The key in the json file for parameter values. Defaults to 'x'.
        cost_key (str, optional): The key in the json file for the cost. Defaults to 'cost'.
    Returns:
        list: The best parameter values, or an empty list if no json files are found.
    """
    results = []
    for file in os.listdir(folder):
        if file.endswith(".json"):
            with open(os.path.join(folder, file), 'r') as f:
                results.append(json.load(f))
    if len(results) == 0:  # no json files in the folder
        return []
    best_result = min(results, key=lambda x: x[cost_key])
    return best_result[param_key]
You can then load the best parameter set as shown below. Note that the function defaults to the key cost for the cost and x for the parameter values, but you can use other keys by passing param_key or cost_key.
params = load_best_parameters('results', param_key='x', cost_key='cost')
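To check that your files follow the expected layout, the sketch below writes two result files to a temporary folder and then picks the entry with the lowest cost. The file names and values are made up for illustration, and the selection is done inline with min so that the block runs on its own.

```python
import json
import os
import tempfile

# Hypothetical results, only for illustration
fake_results = [
    {"cost": 12.5, "x": [1.0, 2.0, 3.0]},
    {"cost": 7.25, "x": [0.5, 1.5, 2.5]},
]

with tempfile.TemporaryDirectory() as folder:
    # Write one json file per result
    for i, result in enumerate(fake_results):
        with open(os.path.join(folder, f"result_{i}.json"), "w") as f:
            json.dump(result, f)

    # Read every json file back
    loaded = []
    for name in sorted(os.listdir(folder)):
        if name.endswith(".json"):
            with open(os.path.join(folder, name), "r") as f:
                loaded.append(json.load(f))

# Keep the entry with the lowest cost
best = min(loaded, key=lambda r: r["cost"])
best_params = best["x"]
print(best_params)
```

If your files use other key names, only the two dictionary keys in the last lines need to change.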
How can I hide the CVODE error warnings?
One solution is to create a context manager that redirects everything written to the "standard error" channel.
import os
import sys
from contextlib import contextmanager

@contextmanager
def silent_errors(stdchannel=sys.stderr, dest_filename=os.devnull):
    oldstdchannel = None
    dest_file = None
    try:
        oldstdchannel = os.dup(stdchannel.fileno())  # keep a copy of the original channel
        dest_file = open(dest_filename, 'w')
        os.dup2(dest_file.fileno(), stdchannel.fileno())  # redirect the channel to the destination file
        yield
    finally:
        if oldstdchannel is not None:
            os.dup2(oldstdchannel, stdchannel.fileno())  # restore the original channel
            os.close(oldstdchannel)
        if dest_file is not None:
            dest_file.close()
Then wrap the call that triggers the warnings (here exemplified with differential_evolution) in the context manager (here named silent_errors).
with silent_errors():
    results = differential_evolution(...)
Please note that this does not work when running in an IPython environment, due to how "standard error" is handled there.
How can I save a numpy array to a json file?
You can manually convert numpy arrays to e.g. lists. An easier approach is to define a custom JSON encoder that automatically converts numpy arrays when needed. You can then pass this encoder as the optional argument cls to json.dump.
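import json
from json import JSONEncoder

import numpy as np

class NumpyArrayEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, np.ndarray):
            return o.tolist()  # convert numpy arrays to plain lists
        return JSONEncoder.default(self, o)

# f is a file opened for writing; note that the object to save comes first
json.dump(my_variable, f, cls=NumpyArrayEncoder)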
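A small round trip can confirm that the encoder behaves as intended. The sketch below repeats the encoder class so that the block runs on its own, and uses json.dumps and json.loads on a string instead of a file. The dictionary contents are made up for illustration. Note that loading gives back a plain list, so you need to convert it to an array yourself if you want one.

```python
import json
from json import JSONEncoder

import numpy as np


class NumpyArrayEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, np.ndarray):
            return o.tolist()  # convert numpy arrays to plain lists
        return JSONEncoder.default(self, o)


# Hypothetical result to save
original = {"cost": 3.2, "x": np.array([0.1, 2.0, 30.0])}

# Encode to a JSON string, then decode it again
text = json.dumps(original, cls=NumpyArrayEncoder)
loaded = json.loads(text)

# The parameter values come back as a list; convert back if an array is needed
restored_x = np.array(loaded["x"])
```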
How can I run multiple optimizations without having to rerun the whole script?
The simplest way is to wrap the optimization part in a for loop. In some cases, the code is easier to follow if you also put the optimization in a separate function.
for i in range(20):
    # Put your optimization code here (make sure to only loop the parts that are not static)
    res = optimization_algorithm(...) # Your optimization algorithm of choice here.
    x0 = res['x'] # Use the found optimum as the starting guess for the next iteration
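As a rough illustration of putting the loop inside a separate function, the sketch below uses a toy cost function and a simple random local search as a stand-in for a real optimizer. Both toy_cost and toy_optimizer are hypothetical placeholders; in practice you would call your own cost function and optimization algorithm instead.

```python
import numpy as np


def toy_cost(x):
    """Hypothetical cost function: squared distance to the point (1, 2)."""
    return float(np.sum((np.asarray(x) - np.array([1.0, 2.0])) ** 2))


def toy_optimizer(cost, x0, rng, n_trials=200, step=0.1):
    """Hypothetical stand-in for a real optimizer: a simple random local search."""
    best_x = np.asarray(x0, dtype=float)
    best_cost = cost(best_x)
    for _ in range(n_trials):
        candidate = best_x + rng.normal(scale=step, size=best_x.shape)
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:  # only accept improvements
            best_x, best_cost = candidate, candidate_cost
    return {"x": best_x, "fun": best_cost}


def run_optimizations(x0, n_runs=20, seed=0):
    """Run the optimizer n_runs times, starting each run from the previous optimum."""
    rng = np.random.default_rng(seed)
    costs = []
    for _ in range(n_runs):
        res = toy_optimizer(toy_cost, x0, rng)
        x0 = res["x"]  # reuse the found optimum as the next starting guess
        costs.append(res["fun"])
    return x0, costs


x_start = np.array([5.0, -3.0])
x_best, costs = run_optimizations(x_start)
```

Keeping the loop in a function like this makes it easy to change the number of runs, or to call it again later, without rerunning the rest of the script.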
??? question "I get the same solution all the time, or my optimization seems to be stuck in a local minimum":
One easy fix is to occasionally perturb the starting guess (x0) slightly. The example below changes each parameter by up to 5% in about 10% of the optimizations. Make sure that the perturbed starting guess stays within the bounds.
```python
if np.random.rand() <= 0.1: # in about 10% of the runs, add up to 5% noise to the starting guess
    x0 = x0 * np.random.uniform(0.95, 1.05, len(x0))
x0 = np.clip(x0, bounds.lb, bounds.ub) # keep the starting guess within the bounds
```
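The same idea can be packaged as a small helper, which makes the chance and the noise level explicit. This is only a sketch: the function name perturb_start_guess and its arguments are hypothetical, and the bounds are passed as plain arrays rather than as a bounds object.

```python
import numpy as np


def perturb_start_guess(x0, lb, ub, rng, chance=0.1, max_noise=0.05):
    """With probability `chance`, scale each parameter by up to +/- max_noise, then clip to the bounds."""
    x0 = np.asarray(x0, dtype=float)
    if rng.random() < chance:
        x0 = x0 * rng.uniform(1 - max_noise, 1 + max_noise, size=x0.shape)
    return np.clip(x0, lb, ub)  # always keep the guess within the bounds


rng = np.random.default_rng(42)
lb = np.array([0.0, 0.0, 0.0])
ub = np.array([10.0, 10.0, 10.0])
x0 = np.array([1.0, 5.0, 10.0])

# chance=1.0 forces the perturbation so that its effect is visible
x0_new = perturb_start_guess(x0, lb, ub, rng, chance=1.0)
print(x0_new)
```

The last parameter starts on its upper bound, so any upward noise on it is removed again by the clipping step.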
How can I run my optimizations in parallel?
One way to drastically reduce the total optimization time is to run multiple optimizations in parallel, rather than one by one. To do this, you need to put the optimization in a separate function and then run that function multiple times, e.g. using multiprocessing. Here is an example of how to set up the optimization function. You can of course use other logic in your optimization function; the code below is just one example.
import json

import numpy as np
from scipy.optimize import differential_evolution # assuming SciPy's differential_evolution is used

# Define constants needed for the optimization
MODEL_NAME = 'my_model_name'
RESULTS_PATH = f'results/{MODEL_NAME}'
PRINT_OPTIMIZATION_ITERATIONS = False

# Define the optimization function (cost_log is your cost function, taking parameters in log space)
def optimize(input_args):
    bounds, args = input_args # unpack the input arguments
    x0_log = np.log(load_best_parameters(RESULTS_PATH, param_key='x', cost_key='cost'))
    if np.random.rand() <= 0.05: # with a small chance add some noise to the starting guess
        x0_log = np.log(np.exp(x0_log) * np.random.uniform(0.95, 1.05, len(x0_log)))
    x0_log = np.clip(x0_log, bounds.lb, bounds.ub) # make sure that the starting guess is within the bounds
    with silent_errors():
        result = differential_evolution(cost_log, bounds, args=args, x0=x0_log, disp=PRINT_OPTIMIZATION_ITERATIONS)
    result['x'] = np.exp(result['x']) # transform back from log space
    with open(f"{RESULTS_PATH}/result_{result['fun']:.2f}.json", "w") as f:
        json.dump({"cost": result['fun'], "x": result['x']}, f, cls=NumpyArrayEncoder)
    return result
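The optimize function above calls cost_log, which is not defined in this section. One plausible shape, assuming that cost_log is a thin wrapper that converts the log-transformed parameters back to linear space before calling the ordinary cost function, is sketched below. The cost function and the target values here are hypothetical placeholders for your own model.

```python
import numpy as np


def cost(x, target):
    """Hypothetical cost in linear space: sum of squared residuals to a target."""
    return float(np.sum((np.asarray(x) - target) ** 2))


def cost_log(x_log, target):
    """Assumed wrapper: convert the log-parameters to linear space, then call cost."""
    return cost(np.exp(x_log), target)


target = np.array([1e-3, 1.0, 1e3])
x = np.array([1e-3, 1.0, 1e3])

# The optimizer works on log-transformed parameters
x_log = np.log(x)
print(cost_log(x_log, target))
```

Searching in log space lets the optimizer cover parameters spanning many orders of magnitude more evenly, which is why the bounds in this approach are also given as logarithms.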
Then, set up the arguments and call the function in a parallel pool.
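from multiprocessing import Pool

from scipy.optimize import Bounds # assuming SciPy's Bounds is used

# On platforms that spawn new processes (e.g. Windows), put the code below inside an `if __name__ == '__main__':` block
args = (sims,)
bounds = Bounds([np.log(1e-5)]*len(x0), [np.log(1e5)]*len(x0)) # bounds in log space
args_list = [(bounds, args)]*NUMBER_OF_OPTIMIZATION_RUNS
with Pool() as p:
    p.map(optimize, args_list)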
How can I get a nice bar which shows the progress of the optimization?
Building on the parallel approach above, you can use tqdm to get a progress bar.
Note that you should avoid printing the optimization iterations when using this approach.
from tqdm import tqdm

with Pool() as p:
    with tqdm(total=NUMBER_OF_OPTIMIZATION_RUNS) as pbar:
        for _ in p.imap_unordered(optimize, args_list):
            pbar.update() # advance the bar each time one optimization finishes