Mathematical models in Jupyter#

Introduction#

Jupyter is a fully-functional alternative to Mathematica or Maple notebooks for developing and analyzing mathematical models in biology (or any other discipine, for that matter). For this, you will need to use a Computer Algebra System (CAS). A CAS is software that emulates manual (pen-and-paper) manipulations of mathematical expressions. Yes, it can be done, and very effectively, for a vast array of mathematical problems! A CAS combined with a graphing/plotting package like matplotlib gives you a powerful tool for mathematical modelling using a Jupyter notebook.

We will use Python’s SymPy package, which provides powerful CAS features for most common mathematical modelling problems.

There is also Sage, a more capable CAS. We will not use it here because is not a regular Python package, but rather, uses Python as its programming language. So unlike SymPy it cannot just be loaded in a Jupyter nb with a Python kernel. Instead, you will need to install its Jupyter kernel. You can install and try sage outside of Jupyter if you want.

So let’s use SymPy in Jupyter.

If you used Anaconda to install Jupyter, it should already include SymPy, Matplotlib, IPython, NumPy, and other useful packages for scientific computing. If you don’t have SymPy for some other reason, install it in Linux/mac using:

$ sudo apt install python3-sympy

Otherwise, follow the instructions here.

We also need to rum some commands so that the plots appear correctly:

%matplotlib inline
import matplotlib.pyplot as p

And import scipy and sympy:

from sympy import *
import scipy as sc
init_printing() # for pretty-printing equations etc

Some preliminaries#

Before we get started with our mathematical modelling session in Jupyter, some SymPy preliminaries.

Symbolic variables#

In CAS’ like SymPy, we need to create symbolic variables for the mathematical variables we want to work with. A new symbolic variable can be created using var. Try this:

x = var('x')
type(x) # check it's class
sympy.core.symbol.Symbol

You can also define multiple symbolic variables at one go:

a, b, c = var("a, b, c")

For more info on symbolic variables, have a look at this.

It is often important to add assumptions (constraints) to our symbolic vars:

x = var('x', real=True)

Now check:

x.is_imaginary
False
x = Symbol('x', positive=True)

Again, check:

x > 0
../_images/Appendix-Maths_18_0.png
x < 0
../_images/Appendix-Maths_19_0.png

Symbolic equations#

We can define the mathematical equations (functions) that we will be using/manipulating as follows:

MyFun = (pi + x)**2; MyFun
../_images/Appendix-Maths_21_0.png
MyFun = N_0 + (N_max - N_0) * exp(-exp(r_max * exp(1) * (t_lag - t)/((N_max - N_0) * log(10)) + 1))

See the nice \(\LaTeX\) - formatted output: this is where init_printing() comes handy.

SymPy has predefined expressions for a number of mathematical constants, such as: `pi` ($\pi$), `e` (exponential), `oo` (infinity).

You can also get your equation in latex syntax! Try:

latex(MyFun)
'\\left(x + \\pi\\right)^{2}'

That has extra escape slashes for Python to be able to parse it correctly. To display it in its actual form (that you can directly use in a \(\LaTeX\) document), print it:

print(latex(MyFun))
\left(x + \pi\right)^{2}

Numerical evaluation#

To evaluate an expression numerically we can use the evalf function (or N). It takes an argument n which specifies the number of significant digits.

pi.evalf(n=100) # pi to a 100 places after decimal!
../_images/Appendix-Maths_30_0.png

N() is shorthand alias for evalf():

N(pi, 50)
../_images/Appendix-Maths_32_0.png

So let’s try evaluating our function:

N(MyFun, 5)
../_images/Appendix-Maths_34_0.png

When we numerically evaluate algebraic expressions we often want to substitute a symbol with a numerical value. In SymPy we do that using the subs function:

MyFun.subs(x, 1.5)
../_images/Appendix-Maths_36_0.png

Now let’s evaluate it:

MyFun.subs(x, 1.5)
../_images/Appendix-Maths_38_0.png

The subs function can also be used to substitute mathematical variables or expressions. Let’s substitute \(x\) with \(a+\pi\):

MyFun.subs(x, a+pi)
../_images/Appendix-Maths_40_0.png

And assign it as a new symbolic equation for using later:

MyFun_new = MyFun.subs(x, a+pi); MyFun_new
../_images/Appendix-Maths_42_0.png

We can also numerically evaluate the function over a range of values using NumPy arrays:

x_vec = sc.arange(0, 10, 0.1)
MyFun_vec = sc.array([N(MyFun.subs(x, xx)) for xx in x_vec]) #Note: using a list comprehension!

We can also evaluate the new function MyFun_new we created by substitution above:

MyFun_new_vec = sc.array([N((MyFun_new).subs(a, xx)) for xx in x_vec])

Now plot the two functions that you evaluated (try adding axes and a legend to these basic plots).

fig, ax = p.subplots()
ax.plot(x_vec, MyFun_vec)
ax.plot(x_vec, MyFun_new_vec)
[<matplotlib.lines.Line2D at 0x7fbaa6a1ad30>]
../_images/Appendix-Maths_49_1.png

However, numerical evaluation using evalf() can be very slow. There is a much more efficient way to do it by using lambdify() to “compile” a Sympy expression into a function that is much more efficient to evaluate numerically:

MyFun_lamb = lambdify([x], MyFun, 'numpy')

The first argument is a (python) list of variables that MyFun_lamb will be a function of. In this case its only \(x\). Now we can directly pass a numpy array and MyFun is evaluated more efficiently:

MyFun_vec = MyFun_lamb(x_vec)

The speedup when using “lambdified” functions instead of direct numerical evaluation can be significant, often several orders of magnitude. Even in this simple example we get a significant speed up:

%%timeit #remember this?

MyFun_vec = sc.array([N(((x + pi)**2).subs(x, xx)) for xx in x_vec])
27.7 ms ± 7.48 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit

MyFun_vec = MyFun_lamb(x_vec)
2.64 µs ± 684 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Fundamental mathematical operations#

Let’s look at some fundamental methematical operations in SymPy that you will almost certainly use at some point in biological models. You can find a full list and addtional tutorials and examples at SymPy documentation.

Expand and factor#

expand(MyFun)
../_images/Appendix-Maths_60_0.png

The expand function takes a number of keywords arguments which we can tell the functions what kind of expansions to want to perform; use help(expand) (or the SymPy documentation) for more info.

You can also print the result of an manipulation in its raw python form:

print(expand(MyFun))
x**2 + 2*pi*x + pi**2

You can also factor using, well, factor():

factor(x**2 + 2*pi*x + pi**2)
../_images/Appendix-Maths_65_0.png

Apart and together#

To manipulate symbolic expressions of fractions, you can use the apart and together functions:

f1 = 1/((a+1)*(a+2)); f1
../_images/Appendix-Maths_68_0.png
apart(f1)
../_images/Appendix-Maths_69_0.png
f2 = 1/(a+2) + 1/(a+3); f2
../_images/Appendix-Maths_70_0.png
together(f2)
../_images/Appendix-Maths_71_0.png

Simplification#

The simplify tries to simplify an expression into a nice looking expression, using various techniques. More specific alternatives to the simplify functions also exist: trigsimp, powsimp, logcombine, etc. Applying simplify to the above example will give the same result as together:

simplify(f2)
../_images/Appendix-Maths_73_0.png

Note that simplify usually combines fractions but does not factor.

In addition to algebraic manipulations, the other main use of CAS is to do calculus, like derivatives and integrals of algebraic expressions.

Differentiation#

Differentiation is usually simple. Use the diff function. The first argument is the expression to take the derivative of, and the second argument is the symbol by which to take the derivative:

diff(MyFun_new, a)
../_images/Appendix-Maths_78_0.png

For higher order derivatives we can do:

diff(MyFun_new, a, a)
../_images/Appendix-Maths_80_0.png
diff(MyFun_new**2, a, 2) # same as above
../_images/Appendix-Maths_81_0.png

You can directly apply another manipulation to the result of a previous operation:

expand(diff(MyFun_new**2, a, 2))
../_images/Appendix-Maths_83_0.png

Calculate the derivative of a multivariate expression:

x, y, z = var("x,y,z")
f = sin(x*y) + cos(y*z)

\(\frac{d^3f}{dxdy^2}\)

diff(f, x, 1, y, 2)
../_images/Appendix-Maths_88_0.png

Integration#

Integration is done in a similar fashion:

MyFun
../_images/Appendix-Maths_91_0.png
integrate(MyFun, x)
../_images/Appendix-Maths_92_0.png

By providing limits for the integration variable we can evaluate definite integrals:

integrate(MyFun, (x, -1, 1))
../_images/Appendix-Maths_94_0.png

and also improper integrals

integrate(exp(-x**2), (x, -oo, oo))
../_images/Appendix-Maths_96_0.png

Remember, oo is the SymPy notation for inifinity.

Sums and products#

You can evaluate sums and products using Sum. Note that this function is named Sum and not sum to avoid namespace conflict.

n = var("n")
Sum(1/n**2, (n, 1, 10))
../_images/Appendix-Maths_101_0.png
Sum(1/n**2, (n,1, 10)).evalf()
../_images/Appendix-Maths_102_0.png
Sum(1/n**2, (n, 1, oo)).evalf()
../_images/Appendix-Maths_103_0.png

Products work much the same way:

Product(n, (n, 1, 10)) # 10!
../_images/Appendix-Maths_105_0.png

Limits#

Limits can be evaluated using the limit function. For example,

limit(sin(x)/x, x, 0)
../_images/Appendix-Maths_108_0.png

We can use ‘limit’ to check the result of derivation using the diff function:

f
../_images/Appendix-Maths_110_0.png
diff(f, x)
../_images/Appendix-Maths_111_0.png

\(\displaystyle \frac{\mathrm{d}f(x,y)}{\mathrm{d}x} = \frac{f(x+h,y)-f(x,y)}{h}\)

h = var("h")
limit((f.subs(x, x+h) - f)/h, h, 0)
../_images/Appendix-Maths_114_0.png

OK!

We can change the direction from which we approach the limiting point using the dir keywork argument:

limit(1/x, x, 0, dir="+")
../_images/Appendix-Maths_117_0.png
limit(1/x, x, 0, dir="-")
../_images/Appendix-Maths_118_0.png

Series#

Series expansion is also one of the most useful features of a CAS. In SymPy we can perform a series expansion of an expression using the series function:

series(exp(x), x) # this is a classic!
../_images/Appendix-Maths_121_0.png

By default it expands the expression around \(x=0\), but we can expand around any value of \(x\) by explicitly include a value in the function call:

series(exp(x), x, 1)
../_images/Appendix-Maths_123_0.png

Or try:

series(log(x), x, 0) # will not work why?
../_images/Appendix-Maths_125_0.png
series(log(x), x,1) # this will work, however 
../_images/Appendix-Maths_126_0.png

And we can explicitly define to which order the series expansion should be carried out:

series(exp(x), x, 0, 3)
../_images/Appendix-Maths_128_0.png

Another way to do the same:

exp(x).series(x,0,3)
../_images/Appendix-Maths_130_0.png

The series expansion includes the order of the approximation, which is very useful for keeping track of the order of validity when we do calculations with series expansions of different orders:

s1 = cos(x).series(x, 0, 5); s1
../_images/Appendix-Maths_132_0.png
s2 = sin(x).series(x, 0, 2); s2
../_images/Appendix-Maths_133_0.png
expand(s1 * s2)
../_images/Appendix-Maths_134_0.png

If we want to get rid of the order information we can use the removeO method:

expand(s1.removeO() * s2.removeO())
../_images/Appendix-Maths_136_0.png

Matrix algebra#

Matrices are defined using the Matrix class:

m11, m12, m21, m22 = var("m11, m12, m21, m22")
b1, b2 = var("b1, b2")
A = Matrix([[m11, m12],[m21, m22]]) # Again, note: capital M for to avoid namespace conflict 
A
../_images/Appendix-Maths_140_0.png
b = Matrix([[b1], [b2]]); b
../_images/Appendix-Maths_141_0.png

With Matrix class instances we can do the usual matrix algebra operations:

A**2
../_images/Appendix-Maths_143_0.png
A * b
../_images/Appendix-Maths_144_0.png

And calculate determinants and inverses, and the like:

A.det()
../_images/Appendix-Maths_146_0.png
A.inv()
../_images/Appendix-Maths_147_0.png

Solving equations#

For solving equations and systems of equations we can use the solve function:

solve(x**2 - 1, x)
../_images/Appendix-Maths_150_0.png
solve(x**4 - x**2 - 1, x)
../_images/Appendix-Maths_151_0.png

System of equations:

solve([x + y - 1, x - y - 1], [x,y])
../_images/Appendix-Maths_153_0.png

In terms of other symbolic expressions:

solve([x + y - a, x - y - c], [x,y])
../_images/Appendix-Maths_155_0.png

You can also solve a single, or a system of ordinary differential equations (ODEs) using dsolve. WE will use this in a couple of the biological examples below.

Some biological examples#

Here are some examples of development and analysis of some fundamental mathematical models in biology.

One population: Exponential growth#

Let’s look at how populations grow when there are no environmental constraints. The differential equation model is:

(11)#\[\begin{equation}\label{eq:exp_growth} \frac{\text{d}N}{\text{d}t} = r_m N \end{equation}\]

where \(r_m\) is the intrinsic, constant rate of population gowth (units of 1/time), and \(N\) is population size (or biomass abundance). I use the subscript \(m\) in \(r_m\) to denote both Malthusian and maximal population growth rate because, in theory, without any constraints, this growth rate is expected to at its theoretical/biological maximum.

Let’s solve equation \ref{eq:exp_growth} so that we can calculate population size \(N_t\) at any given time \(t\) given a starting population size \(N_0\) at time 0.

First assign the symbolic variables:

r_max, N_0, K, t_lag, t  = var("r_max N_0 K t_lag t",real = True) # the real bit is not really necessary here
N_0 + (K - N_0) * exp(-exp(r_max * exp(1) * (t_lag - t)/((K - N_0) * log(10)) + 1))
r_m, N, t = var("r_m N t",real = True) # the real bit is not really necessary here

Now tell SymPy that \(N\) is a function:

N = Function('N')

Define \(N\) is a derivative of \(t\)

dN_dt = Derivative(N(t), t) - r_m*N(t); dN_dt
../_images/Appendix-Maths_166_0.png

Note that we have simply re-written the condition that LHS = RHS in eqn \ref{eq:exp_growth}. Now that we have the differential equation set up, we can solve it using dsolve. Since this is a simple ODE, Sympy can do it on its own with no hints or guesses (unlike more complex ODEs; see the documentation):

MyEq_sol = dsolve(dN_dt); MyEq_sol
../_images/Appendix-Maths_168_0.png

If you remember your high-school calculus, you might recall that \(C_1\) here is an arbitrary constant. We now need to re-express it in terms of the initial conditions. We can do so by setting \(t = 0\), and then setting that equal to \(N(0)\) at time 0:

\[C_1 = N_0\]

That is,

(12)#\[\begin{equation}\label{eq:exp_growth_sol} N{\left (t \right )} = N_0 e^{r_m t} \end{equation}\]

We could use Sympy for this last step as well (using subs() like you learned above), but that would be just plain silly – like using a sledge-hammer to drive in a nail!

We can now have a go at plotting the model (eqn. \ref{eq:exp_growth}), and also the solution eqn. \ref{eq:exp_growth_sol}. First, let’s get an approximate solution by using numerical integration (which you learnt in the Advanced Python chapter):

from scipy  import integrate

# parameters
r_m = 1.

# initial conditions
N_0 = 0.1

# The time vector
t_vec = sc.arange(0, 10., 0.01)

def exp_pop(N, t, r_m):
    """The right-hand side of the exponential growth ODE"""
    return r_m*N

N_vec = integrate.odeint(exp_pop, N_0, t_vec, args=(r_m,)) # the comma is needed!

# plot the numerical solution
p.plot(t_vec, N_vec)
p.xlabel('Time') ; p.ylabel('$N$') 

# plot analytical solution
p.plot(t_vec, N_0 * sc.exp(r_m * t_vec),'k--')
p.legend(['numerical approximation', 'analytical solution'], loc='best') # draw legend
<matplotlib.legend.Legend at 0x7fbaa240abe0>
../_images/Appendix-Maths_170_1.png

These look practically identical. But they are not:

N_vec - N_0 * sc.exp(r_m * t_vec)
array([[  0.00000000e+00,  -1.00501671e-03,  -2.02013400e-03, ...,
         -2.13744854e+03,  -2.15893125e+03,  -2.18062988e+03],
       [  1.00502283e-03,   6.11819401e-09,  -1.01511118e-03, ...,
         -2.13744753e+03,  -2.15893025e+03,  -2.18062887e+03],
       [  2.02014197e-03,   1.01512526e-03,   7.96376975e-09, ...,
         -2.13744651e+03,  -2.15892923e+03,  -2.18062786e+03],
       ..., 
       [  2.13744963e+03,   2.13744863e+03,   2.13744761e+03, ...,
          1.09563845e-03,  -2.14816243e+01,  -4.31802491e+01],
       [  2.15893236e+03,   2.15893136e+03,   2.15893034e+03, ...,
          2.14838268e+01,   1.10688877e-03,  -2.16975180e+01],
       [  2.18063100e+03,   2.18062999e+03,   2.18062898e+03, ...,
          4.31824631e+01,   2.16997432e+01,   1.11834951e-03]])

One population: Logistic Population growth#

Populations eventually run into contraints, even if they can grow exponentially at the start (eqn \ref{eq:exp_growth}). The classical model for logistic growth in population density (\(N\)) captures this dynamic:

(13)#\[ \frac{\text{d}N}{\text{d}t} = r_m N \left(1-\frac{N}{K}\right) \]

where \(K\) is the carrying capacity of the environment, while \(r_m\) is the same parameter same as above. Let’s solve this one as well.

As in the case of the exponential growth above, let’s find the solution to this equation for any arbitrary time point \(t\).

Again, first we define the vars and the function:

r_m, K, N, t = var("r_m K N t",real = True) # the real bit is not really necessary here

N = Function('N')
dN_dt = Derivative(N(t), t) - r_m * N(t) * (1 - N(t) / K); dN_dt
../_images/Appendix-Maths_175_0.png

Again, as in the exponential growth example, we have simply re-written the condition that LHS = RHS in eqn \ref{eq:logist_growth}. Now we can solve the ODE:

MyEq_sol = dsolve(dN_dt); MyEq_sol
../_images/Appendix-Maths_177_0.png

This is a bit more complicated than the solution for exponential growth above. But we can solve it the same way. First substitute \(t = 0\), and then solve the resulting equation for the initital condition of \(N_0\), which then gives This the time-dependent solution:

(14)#\[\begin{equation} N_t = \frac{N_0 K\mathrm{e}^{r_m t}}{K + N_0(\mathrm{e}^{r_m t}-1)} \end{equation}\]

You can do the last steps to obtain this solution using Sympy as well (I leave it to you to try it).

No let’s again compare the analytical solution against the numerical one:

from scipy  import integrate

# parameters
r_m = 1.
K = 10.
# initial condition
N_0 = 0.1

#The time vector 
t_vec = sc.arange(0, 10., 0.01)

def log_pop(N, t, r_m, K):
    """The right-hand side of the logistic ODE"""
    return r_m*N*(1-N/K)

N_vec = integrate.odeint(log_pop, N_0, t_vec, args=(r_m, K));

p.plot(t_vec, N_vec) # plot the solution
p.xlabel('Time') ; p.ylabel('$N$') 

# plot analytical solution
p.plot(t_vec, K * N_0 * sc.exp(r_m * t_vec)/(K + N_0 * (sc.exp(r_m * t_vec) - 1.)),'k--')
p.legend(['numerical approximation', 'analytical solution'], loc='best') # draw legend
<matplotlib.legend.Legend at 0x7fbaa02f1198>
../_images/Appendix-Maths_180_1.png

Two interacting populations: The Lotka-Volterra predator-prey model#

Now for the classical Lotka-Volterra model that you encountered in the advanced Python week (without logistic growth for the consumer, \(C\)).

(15)#\[\begin{align} \frac{dN}{dt} &= r_m N \left(1-\frac{N}{K}\right) - a N C\\ \frac{dC}{dt} &= e a N C - z C \end{align}\]

here \(r_m\) and \(K\) is the Resource’s growth rate and carrying capacity respectively as in the Logistic equation, \(a\) is the consumer’s search rate for the resource, \(e\) is consumer’s biomass conversion efficiency, and \(z\) is it’s mortality rate.

To solve this system of ODEs, we will take a different approach – We will solve for the equilibrum (steady state for the two species’ populations). Again, we start by define the vars:

r_m, a, e, z, K, N, C, t = var("r_m, a, e, z, K, N, C, t",real = True)

Now define the sysyem of ODEs for Sympy:

dN_dt = r_m * N *(1-N/K) - a * N * C
dC_dt = e * a * N * C - z * C

dC_dt, dN_dt
../_images/Appendix-Maths_184_0.png

Now define the equilibrium state:

N_eqlb = Eq(dN_dt, 0)
C_eqlb = Eq(dC_dt, 0)
N_eqlb, C_eqlb
../_images/Appendix-Maths_186_0.png

Solve it:

N_eqlb_sol = solve(N_eqlb, C)
C_eqlb_sol = solve(C_eqlb, N)

N_eqlb_sol, C_eqlb_sol
../_images/Appendix-Maths_188_0.png

So there is one equilibrium solution where both species maintain their non-zero populations. That was easy! Now you don’t need to guess what parameter values will give you coexistence (both populations remain > 0). Just substitute parameter combinations that satisfy the conditions that

\[ \frac{r_{m} \left(K - N\right)}{K a} > 0, \textrm{ and } \frac{z}{a e} > 0\]

As an exercise try plotting this exact solution for the steady state along with the one you would obtain (asymptotically) using numerical integration of the system of ODEs. Below is code that does the numerical integration (essentially, same as the LV.py script).

from scipy import integrate

t_vec = sc.arange(0, 100., 0.01)

# parameters
r_m = 1.
a = 1
e = 0.5
z = .5
K =10

# initial condition: this is an array now!
N0C0 = sc.array([1., 1.])

# the function still receives only `x`, but it will be an array, not a number
def LV(NC, t, r_m, K, a, e, z):
    # Unlike the esponental and logistic growth model, we now need to convert 
    # the output to a numpy array as it has two populations.
    return sc.array([ r_m * NC[0]*(1-NC[0]/K) - a * NC[0] * NC[1],
                   e * a * NC[0] * NC[1] - z * NC[1] ])

NC_vec = integrate.odeint(LV, N0C0, t_vec, (r_m, K, a, e, z))

Check NC_vec’s dimensions:

print(NC_vec.shape)
(10000, 2)

Now let’s plot the solution. But first, just for fun, let’s change the plot style:

print(p.style.available)
['bmh', 'seaborn-darkgrid', 'fast', 'seaborn-dark', 'seaborn-notebook', 'classic', 'seaborn', 'ggplot', 'seaborn-white', 'seaborn-pastel', 'seaborn-deep', 'seaborn-dark-palette', 'dark_background', 'seaborn-paper', 'grayscale', 'seaborn-ticks', '_classic_test', 'seaborn-whitegrid', 'Solarize_Light2', 'seaborn-talk', 'fivethirtyeight', 'seaborn-poster', 'seaborn-muted', 'seaborn-colorblind', 'seaborn-bright']
p.style.use('seaborn-darkgrid')
p.plot(t_vec, NC_vec)
p.xlabel('Time'); p.ylabel('Population size') # and of y-axis
p.legend(['Resource ($N$)', 'Consumer ($C$)'], loc='best')
<matplotlib.legend.Legend at 0x7fbaa0186198>
../_images/Appendix-Maths_196_1.png

An useful thing to do here is take a look at the phase space, that is, plot only the dependent variables, without respect to time:

p.plot(NC_vec[0,0], NC_vec[0,1], 'o')
print('Initial condition:', NC_vec[0])

p.plot(NC_vec[:,0], NC_vec[:,1])

#Another solution with a different initial condition:
#NC_vec2 = odeint(LV, [2., 4.], t_vec, (r_m, K, a, e, z))
#p.plot(NC_vec2[:,0], NC_vec2[:,1])
#p.plot(NC_vec2[0,0], NC_vec2[0,1], 'o')
#p.xlabel('Resource Population size'); p.ylabel('Consumer Population size') # and of y-axis
Initial condition: [ 1.  1.]
[<matplotlib.lines.Line2D at 0x7fbaa00ecd30>]
../_images/Appendix-Maths_198_2.png

Readings and Resources#