Likelihood

The likelihood module contains functions for calculating and visualizing log-likelihood surfaces for models and data in Curveball. Functions that use growth data expect a pandas.DataFrame generated by the ioutils module. Functions that use the results of model fitting require lmfit.model.ModelResult objects.

Members

curveball.likelihood.loglik(t, y, y_sig, f, penalty=None, **params)

Computes the log-likelihood of seeing the data given a model, assuming normally distributed observation/measurement errors:

\[\log{L(y | \theta)} = -\frac{1}{2} \sum_i \left[ \log{(2 \pi \sigma_{i}^{2})} + \frac{(y_i - f(t_i; \theta))^2}{\sigma_{i}^{2}} \right]\]

which is the log-likelihood of seeing the data points \(t_i, y_i\) with measurement error \(\sigma_i\) given the model function \(f\), the model parameters \(\theta\), and that the measurement error at time \(t_i\) has a normal distribution with mean 0.
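
The formula can be sketched directly in NumPy. This is a minimal illustration of the equation, not Curveball's actual implementation; the names `gaussian_loglik` and `line` are hypothetical and exist only for this example.

```python
import numpy as np

def gaussian_loglik(t, y, y_sig, f, penalty=None, **params):
    """Sketch of the Gaussian log-likelihood formula (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    y_sig = np.asarray(y_sig, dtype=float)
    yhat = f(t, **params)        # expected observations f(t_i; theta)
    var = y_sig ** 2             # sigma_i^2
    ll = -0.5 * np.sum(np.log(2 * np.pi * var) + (y - yhat) ** 2 / var)
    if penalty is not None:
        ll -= penalty(**params)  # the scalar penalty is subtracted
    return float(ll)

def line(t, a, b):
    """Toy linear model, used only for illustration."""
    return a * t + b

t = np.array([0.0, 1.0, 2.0])
y = line(t, a=2.0, b=1.0)        # noise-free synthetic observations
y_sig = np.ones_like(t)

ll_true = gaussian_loglik(t, y, y_sig, line, a=2.0, b=1.0)
ll_off = gaussian_loglik(t, y, y_sig, line, a=2.5, b=1.0)
```

With noise-free data the residual term vanishes at the true parameters, so `ll_true` reduces to the normalization term alone and any other parameter set scores lower.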

Parameters
t : np.ndarray

one-dimensional array of time points

y : np.ndarray

one-dimensional array of the means of the observations

y_sig : np.ndarray

one-dimensional array of standard deviations of the observations

f : callable

a function that calculates the expected observations (f(t)) from t and any parameters in params

penalty : callable, optional

a function that calculates a scalar penalty from the parameters in params to be subtracted from the log-likelihood

params : floats, optional

model parameters

Returns
float

the log-likelihood result

curveball.likelihood.loglik_r_nu(r_range, nu_range, df, f=baranyi_roberts_function, penalty=None, **params)

Estimates the log-likelihood surface for \(r\) and \(\nu\) given data and a model function.

Parameters
r_range, nu_range : numpy.ndarray

vectors of floats of \(r\) and \(\nu\) values on which to compute the log-likelihood

df : pandas.DataFrame

data frame with Time and OD columns

f : callable, optional

model function, defaults to curveball.baranyi_roberts_model.baranyi_roberts_function()

penalty : callable, optional

if given, the result of penalty will be subtracted from the log-likelihood for each parameter set

params : floats

values for the model parameters used by f

Returns
np.ndarray

two-dimensional array of log-likelihood calculations; value at index i, j will have the value for r_range[i] and nu_range[j]

See also

loglik
loglik_r_q0
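
The surface computation can be pictured as a double loop over the two parameter ranges. The sketch below is an assumption-laden illustration, not the library's code: it takes plain arrays instead of a data frame, uses the Richards growth curve as a stand-in for the Baranyi-Roberts model, and the names `richards`, `gaussian_loglik` and `loglik_surface` are hypothetical.

```python
import numpy as np

def richards(t, y0, K, r, nu):
    """Richards growth curve, a stand-in for the Baranyi-Roberts model."""
    return K / (1 + ((K / y0) ** nu - 1) * np.exp(-r * nu * t)) ** (1 / nu)

def gaussian_loglik(t, y, y_sig, f, **params):
    """Gaussian log-likelihood of y given model f and parameters."""
    resid = y - f(t, **params)
    return -0.5 * np.sum(np.log(2 * np.pi * y_sig ** 2) + (resid / y_sig) ** 2)

def loglik_surface(r_range, nu_range, t, y, y_sig, f, **params):
    """Evaluate the log-likelihood on the r x nu grid; entry [i, j]
    corresponds to r_range[i] and nu_range[j]."""
    surface = np.empty((len(r_range), len(nu_range)))
    for i, r in enumerate(r_range):
        for j, nu in enumerate(nu_range):
            surface[i, j] = gaussian_loglik(t, y, y_sig, f, r=r, nu=nu, **params)
    return surface

# Noise-free synthetic data generated with r=0.5 and nu=2.0
t = np.linspace(0, 12, 25)
y = richards(t, y0=0.1, K=1.0, r=0.5, nu=2.0)
y_sig = np.full_like(t, 0.05)

rs = np.linspace(0.1, 1.0, 10)
nus = np.linspace(0.5, 3.0, 6)
L = loglik_surface(rs, nus, t, y, y_sig, richards, y0=0.1, K=1.0)

# Grid location of the maximum of the surface
i, j = np.unravel_index(np.argmax(L), L.shape)
```

Because the grid contains the generating values, the maximum of `L` falls on the grid point for r=0.5 and nu=2.0, which is a quick sanity check on the index convention.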

curveball.likelihood.loglik_r_q0(r_range, q0_range, df, f=baranyi_roberts_function, penalty=None, **params)

Estimates the log-likelihood surface for \(r\) and \(q_0\) given data and a model function.

Parameters
r_range, q0_range : numpy.ndarray

vectors of floats of \(r\) and \(q_0\) values on which to compute the log-likelihood

df : pandas.DataFrame

data frame with Time and OD columns

f : callable, optional

model function, defaults to curveball.baranyi_roberts_model.baranyi_roberts_function()

penalty : callable, optional

if given, the result of penalty will be subtracted from the log-likelihood for each parameter set

params : floats

values for the model parameters used by f

Returns
np.ndarray

two-dimensional array of log-likelihood calculations; value at index i, j will have the value for r_range[i] and q0_range[j]

See also

loglik
loglik_r_nu

curveball.likelihood.plot_loglik(Ls, xrange, yrange, xlabel=None, ylabel=None, columns=4, fig_title=None, normalize=True, ax_titles=None, cmap='viridis', colorbar=True, ax_width=4, ax_height=4, ax=None)

Plots one or more log-likelihood surfaces.

Parameters
Ls : sequence of numpy.ndarray

list or tuple of two-dimensional log-likelihood arrays; if a single array is given it will be converted to a single-element list

xrange, yrange : np.ndarray

values on x-axis and y-axis of the plot (rows and columns of Ls, respectively)

xlabel, ylabel : str, optional

strings for x and y labels

columns : int, optional

number of columns in the figure when Ls has more than one matrix

fig_title : str, optional

a title for the whole figure

normalize : bool, optional

if True, all matrices will be plotted using a single color scale

ax_titles : list or tuple of str, optional

titles corresponding to the different matrices in Ls

cmap : str, optional

name of a matplotlib colormap (to see list, call matplotlib.pyplot.colormaps()), defaults to viridis

colorbar : bool, optional

if True, a colorbar will be added to the plot

ax_width, ax_height : int, optional

width and height of each panel (one for each matrix in Ls)

ax : matplotlib axes or numpy.ndarray of axes, optional

if given, will plot into ax, otherwise will create a new figure

Returns
fig : matplotlib.figure.Figure

figure object

ax : numpy.ndarray

array of axis objects

Examples

>>> L = loglik_r_nu(rs, nus, df, y0=y0, K=K, q0=q0, v=v)
>>> plot_loglik(L, rs, nus, normalize=False, fig_title=fig_title, xlabel=r'$r$', ylabel=r'$\nu$', colorbar=False)
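
The normalize option refers to drawing several surfaces on one color scale. How plot_loglik does this internally is not stated here; the sketch below shows one plausible way to derive shared color limits, with the hypothetical helper `shared_color_limits`. The resulting pair could be passed as the `vmin` and `vmax` arguments of matplotlib plotting calls such as `pcolormesh` or `imshow`.

```python
import numpy as np

def shared_color_limits(Ls):
    """Return (vmin, vmax) spanning all surfaces in Ls (hypothetical helper)."""
    # Accept a single 2-D array as well as a sequence of arrays
    if isinstance(Ls, np.ndarray) and Ls.ndim == 2:
        Ls = [Ls]
    vmin = min(np.nanmin(L) for L in Ls)
    vmax = max(np.nanmax(L) for L in Ls)
    return float(vmin), float(vmax)

A = np.array([[-10.0, -5.0], [-3.0, -1.0]])
B = np.array([[-20.0, -2.0], [-4.0, -0.5]])
vmin, vmax = shared_color_limits([A, B])
```

With a shared scale, the same color means the same log-likelihood value in every panel, which makes panels directly comparable at the cost of less contrast within each one.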

curveball.likelihood.plot_model_loglik(m, df, fig_title=None)

Plots the log-likelihood surfaces for \(\nu\) over \(r\) and \(q_0\) over \(r\) for the given data and model fitting result.

Parameters
m : lmfit.model.ModelResult

model for which to plot the log-likelihood surface

df : pandas.DataFrame

data frame with Time and OD columns used to fit the model

fig_title : str, optional

title for the plot

Returns
fig : matplotlib.figure.Figure

figure object

ax : numpy.ndarray

array of axis objects

Examples

>>> m = curveball.models.fit_model(df)
>>> curveball.likelihood.plot_model_loglik(m, df)

curveball.likelihood.ridge_regularization(lam, **center)

Creates a penalty function that employs the ridge regularization method:

\[P = \lambda ||\theta - \theta_0||_2\]

where \(\lambda\) is the regularization scale, \(\theta\) is the vector of model parameters, and \(\theta_0\) is the vector of model parameter guesses. This is similar to using a multivariate Gaussian prior distribution on the model parameters, with the Gaussian centered at \(\theta_0\) and scaled by \(\lambda\).
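
A penalty factory of this kind can be sketched as a closure over the guesses. The code below follows the formula as written (the Euclidean norm, not its square) and is not taken from Curveball's source; the name `ridge_penalty` and the choice to ignore parameters that have no guess are assumptions of this sketch.

```python
import numpy as np

def ridge_penalty(lam, **center):
    """Build a penalty function lam * ||theta - theta_0||_2 (hypothetical helper)."""
    def penalty(**params):
        # Assumption: only parameters that have a guess in `center` contribute
        diffs = [params[name] - guess
                 for name, guess in center.items() if name in params]
        return lam * float(np.linalg.norm(diffs))
    return penalty

penalty = ridge_penalty(2.0, y0=0.1, K=1.0, r=1.0)
p_center = penalty(y0=0.1, K=1.0, r=1.0)   # parameters equal to the guesses
p_off = penalty(y0=0.1, K=1.3, r=1.4)      # differences of 0, 0.3 and 0.4
```

The penalty is zero when the parameters equal the guesses and grows linearly with the distance from them, so subtracting it from the log-likelihood pulls the optimum toward \(\theta_0\).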

Parameters
lam : float

the penalty factor or regularization scale

center : floats, optional

guesses of model parameters

Returns
callable

the penalty function, accepts model parameters as float keyword arguments and returns a float penalty to the log-likelihood

Examples

>>> penalty = ridge_regularization(1, y0=0.1, K=1, r=1)
>>> loglik(t, y, y_sig, logistic, penalty=penalty, y0=0.12, K=0.98, r=1.1)