Likelihood

The likelihood module contains functions for calculating and visualizing log-likelihood surfaces for models and data in Curveball. Functions that use growth data expect a pandas.DataFrame generated by the ioutils module. Functions that use the results of model fitting require lmfit.model.ModelResult objects.

Members

curveball.likelihood.loglik(t, y, y_sig, f, penalty=None, **params)

Computes the log-likelihood of seeing the data given a model, assuming normally distributed observation/measurement errors:

\[\log{L(y | \theta)} = -\frac{1}{2} \sum_i \left[ \log{(2 \pi \sigma_{i}^{2})} + \frac{(y_i - f(t_i; \theta))^2}{\sigma_{i}^{2}} \right]\]

which is the log-likelihood of seeing the data points \(t_i, y_i\) with measurement error \(\sigma_i\) given the model function \(f\), the model parameters \(\theta\), and that the measurement error at time \(t_i\) has a normal distribution with mean 0.

Parameters

t : np.ndarray

one-dimensional array of time points

y : np.ndarray

one-dimensional array of the means of the observations

y_sig : np.ndarray

one-dimensional array of standard deviations of the observations

f : callable

a function that calculates the expected observations (f(t)) from t and any parameters in params

penalty : callable, optional

a function that calculates a scalar penalty from the parameters in params, to be subtracted from the log-likelihood

params : floats, optional

model parameters

Returns

float

the log-likelihood result
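The formula above can be sketched directly in NumPy. This is a minimal illustration under stated assumptions, not Curveball's actual implementation: `loglik_sketch` and the `logistic` model are hypothetical names introduced here, and the data are synthetic and noise-free.

```python
import numpy as np


def logistic(t, y0, K, r):
    """Hypothetical logistic growth model, used here only for illustration."""
    return K / (1 - (1 - K / y0) * np.exp(-r * t))


def loglik_sketch(t, y, y_sig, f, penalty=None, **params):
    """Gaussian log-likelihood of (t, y) under model f, minus an optional penalty."""
    yhat = f(t, **params)                      # expected observations f(t_i; theta)
    var = np.asarray(y_sig, dtype=float) ** 2  # sigma_i^2
    ll = -0.5 * np.sum(np.log(2 * np.pi * var) + (y - yhat) ** 2 / var)
    if penalty is not None:
        ll -= penalty(**params)                # subtract the scalar penalty
    return float(ll)


t = np.linspace(0, 12, 25)
y = logistic(t, y0=0.1, K=1.0, r=0.8)          # noise-free synthetic "data"
y_sig = np.full_like(t, 0.05)

# The generating parameters should score higher than a mismatched growth rate.
ll_true = loglik_sketch(t, y, y_sig, logistic, y0=0.1, K=1.0, r=0.8)
ll_off = loglik_sketch(t, y, y_sig, logistic, y0=0.1, K=1.0, r=0.5)
```

Because the residuals vanish at the generating parameters, `ll_true` reduces to the normalization term alone, which makes it a convenient sanity check for the sign and the factor of one half.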

curveball.likelihood.loglik_r_nu(r_range, nu_range, df, f=<function baranyi_roberts_function>, penalty=None, **params)

Estimates the log-likelihood surface for \(r\) and \(\nu\) given data and a model function.

Parameters

r_range, nu_range : numpy.ndarray

vectors of floats of \(r\) and \(\nu\) values on which to compute the log-likelihood

df : pandas.DataFrame

data frame with Time and OD columns

f : callable, optional

model function, defaults to curveball.baranyi_roberts_model.baranyi_roberts_function()

penalty : callable, optional

if given, the result of penalty will be subtracted from the log-likelihood for each parameter set

params : floats

values for the model parameters used by f

Returns

np.ndarray

two-dimensional array of log-likelihood calculations; the value at index i, j is the log-likelihood for r_range[i] and nu_range[j]

See also

loglik, loglik_r_q0
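The index convention of the returned array can be illustrated with a nested loop over the two parameter ranges. The sketch below is an assumption-laden stand-in rather than the library's code: it takes plain arrays `t`, `y`, `y_sig` instead of a data frame, and `richards`, `gaussian_loglik`, and `loglik_surface` are hypothetical names.

```python
import numpy as np


def richards(t, y0, K, r, nu):
    """Hypothetical Richards growth model with rate r and shape parameter nu."""
    return K / (1 - (1 - (K / y0) ** nu) * np.exp(-r * nu * t)) ** (1.0 / nu)


def gaussian_loglik(t, y, y_sig, f, **params):
    """Gaussian log-likelihood with per-point standard deviations y_sig."""
    var = y_sig ** 2
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (y - f(t, **params)) ** 2 / var)


def loglik_surface(r_range, nu_range, t, y, y_sig, f, **params):
    """Return an array L with L[i, j] = log-likelihood at (r_range[i], nu_range[j])."""
    L = np.empty((len(r_range), len(nu_range)))
    for i, r in enumerate(r_range):
        for j, nu in enumerate(nu_range):
            L[i, j] = gaussian_loglik(t, y, y_sig, f, r=r, nu=nu, **params)
    return L


t = np.linspace(0, 24, 49)
y = richards(t, y0=0.1, K=1.0, r=0.4, nu=1.5)   # noise-free synthetic data
y_sig = np.full_like(t, 0.02)

rs = np.linspace(0.2, 0.6, 5)                   # grid contains the generating r = 0.4
nus = np.linspace(0.5, 2.5, 5)                  # grid contains the generating nu = 1.5
L = loglik_surface(rs, nus, t, y, y_sig, richards, y0=0.1, K=1.0)
```

Rows of `L` follow `rs` and columns follow `nus`, so the remaining model parameters (`y0` and `K` here) are held fixed while the surface is scanned.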

curveball.likelihood.loglik_r_q0(r_range, q0_range, df, f=<function baranyi_roberts_function>, penalty=None, **params)

Estimates the log-likelihood surface for \(r\) and \(q_0\) given data and a model function.

Parameters

r_range, q0_range : numpy.ndarray

vectors of floats of \(r\) and \(q_0\) values on which to compute the log-likelihood

df : pandas.DataFrame

data frame with Time and OD columns

f : callable, optional

model function, defaults to curveball.baranyi_roberts_model.baranyi_roberts_function()

penalty : callable, optional

if given, the result of penalty will be subtracted from the log-likelihood for each parameter set

params : floats

values for the model parameters used by f

Returns

np.ndarray

two-dimensional array of log-likelihood calculations; the value at index i, j is the log-likelihood for r_range[i] and q0_range[j]

See also

loglik, loglik_r_nu
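Given the row/column convention of the returned array, the grid point with the highest log-likelihood can be recovered with `numpy.unravel_index`. The surface below is a synthetic quadratic stand-in, not output from `loglik_r_q0`; its peak location and the grid values are assumptions chosen for illustration.

```python
import numpy as np

r_range = np.linspace(0.1, 1.0, 10)
q0_range = np.linspace(0.01, 0.1, 10)

# Synthetic stand-in surface: rows follow r_range, columns follow q0_range.
# It peaks at r = 0.4 and q0 = 0.05 by construction.
L = (-((r_range[:, None] - 0.4) ** 2) / 0.01
     - ((q0_range[None, :] - 0.05) ** 2) / 0.0001)

# Convert the flat argmax position back to (row, column) indices.
i, j = np.unravel_index(np.argmax(L), L.shape)
r_best, q0_best = r_range[i], q0_range[j]
```

The result is only as fine as the grid: the true maximum may lie between grid points, so a denser range around `r_best` and `q0_best` can be scanned afterwards.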

curveball.likelihood.plot_loglik(Ls, xrange, yrange, xlabel=None, ylabel=None, columns=4, fig_title=None, normalize=True, ax_titles=None, cmap='viridis', colorbar=True, ax_width=4, ax_height=4, ax=None)

Plots one or more log-likelihood surfaces.

Parameters

Ls : sequence of numpy.ndarray

list or tuple of two-dimensional log-likelihood arrays; if a single array is given, it will be converted to a list of size 1

xrange, yrange : np.ndarray

values on the x-axis and y-axis of the plot (rows and columns of Ls, respectively)

xlabel, ylabel : str, optional

strings for the x and y labels

columns : int, optional

number of columns in the figure when Ls has more than one matrix

fig_title : str, optional

a title for the whole figure

normalize : bool, optional

if True, all matrices will be plotted using a single color scale

ax_titles : list or tuple of str, optional

titles corresponding to the different matrices in Ls

cmap : str, optional

name of a matplotlib colormap (to see the list, call matplotlib.pyplot.colormaps()), defaults to viridis

colorbar : bool, optional

if True, a colorbar will be added to the plot

ax_width, ax_height : int, optional

width and height of each panel (one for each matrix in Ls)

ax : matplotlib axes or numpy.ndarray of axes, optional

if given, will plot into ax, otherwise will create a new figure

Returns

fig : matplotlib.figure.Figure

figure object

ax : numpy.ndarray

array of axis objects

Examples

>>> L = loglik_r_nu(rs, nus, df, y0=y0, K=K, q0=q0, v=v)
>>> plot_loglik(L, rs, nus, normalize=False, fig_title=fig_title, xlabel=r'$r$', ylabel=r'$\nu$', colorbar=False)

curveball.likelihood.plot_model_loglik(m, df, fig_title=None)

Plots the log-likelihood surfaces for \(\nu\) over \(r\) and \(q_0\) over \(r\) for given data and a model fitting result.

Parameters

m : lmfit.model.ModelResult

model for which to plot the log-likelihood surface

df : pandas.DataFrame

data frame with Time and OD columns used to fit the model

fig_title : str

title for the plot

Returns

fig : matplotlib.figure.Figure

figure object

ax : numpy.ndarray

array of axis objects

Examples

>>> m = curveball.models.fit_model(df)
>>> curveball.likelihood.plot_model_loglik(m, df)

curveball.likelihood.ridge_regularization(lam, **center)

Creates a penalty function that employs the ridge regularization method:

\[P = \lambda ||\theta - \theta_0||_2\]

where \(\lambda\) is the regularization scale, \(\theta\) is the vector of model parameters, and \(\theta_0\) is the vector of model parameter guesses. This is similar to using a multivariate Gaussian prior distribution on the model parameters, with the Gaussian centered at \(\theta_0\) and scaled by \(\lambda\).

Parameters

lam : float

the penalty factor or regularization scale

center : floats, optional

guesses of model parameters

Returns

callable

the penalty function; accepts model parameters as float keyword arguments and returns a float penalty to the log-likelihood

Examples

>>> penalty = ridge_regularization(1, y0=0.1, K=1, r=1)
>>> loglik(t, y, y_sig, logistic, penalty=penalty, y0=0.12, K=0.98, r=1.1)
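The penalty formula can be sketched as a closure over the regularization scale and the parameter guesses. This is one possible reading of the formula, not the library's code: `ridge_penalty_sketch` is a hypothetical name, and ignoring parameters that have no guess in `center` is an assumption made here.

```python
import numpy as np


def ridge_penalty_sketch(lam, **center):
    """Return a penalty function P(theta) = lam * ||theta - theta_0||_2."""
    def penalty(**params):
        # Assumption: only parameters that have a guess in `center` contribute.
        diffs = [params[name] - guess
                 for name, guess in center.items() if name in params]
        return float(lam * np.linalg.norm(diffs))
    return penalty


penalty = ridge_penalty_sketch(2.0, y0=0.1, K=1.0, r=1.0)

# No penalty when the parameters equal the guesses.
p_at_center = penalty(y0=0.1, K=1.0, r=1.0)

# Offsets of 3 and 4 give a Euclidean distance of 5, scaled by lam = 2.
p_shifted = penalty(y0=0.1, K=1.0 + 3.0, r=1.0 + 4.0)
```

A function of this shape can be passed as the `penalty` argument described above, since it accepts the model parameters as keyword arguments and returns a single float.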