Likelihood

The likelihood module contains functions used to calculate and visualize log-likelihood surfaces for models and data in Curveball. Functions that use growth data expect a pandas.DataFrame generated by the ioutils module. Functions that use the results of model fitting require lmfit.model.ModelResult objects.

Members

curveball.likelihood.loglik(t, y, y_sig, f, penalty=None, **params)[source]

Computes the log-likelihood of seeing the data given a model, assuming normally distributed observation/measurement errors:

$$\log L(y \mid \theta) = -\frac{1}{2} \sum_i \left[ \log(2 \pi \sigma_i^2) + \frac{(y_i - f(t_i; \theta))^2}{\sigma_i^2} \right]$$

which is the log-likelihood of seeing the data points (t_i, y_i) with measurement error σ_i given the model function f, the model parameters θ, and that the measurement error at time t_i has a normal distribution with mean 0.

Parameters

t : np.ndarray

one-dimensional array of time points

y : np.ndarray

one-dimensional array of the means of the observations

y_sig : np.ndarray

one-dimensional array of the standard deviations of the observations

f : callable

a function that calculates the expected observations (f(t)) from t and any parameters in params

penalty : callable, optional

a function that calculates a scalar penalty from the parameters in params, to be subtracted from the log-likelihood

params : floats, optional

model parameters

Returns

float

the log-likelihood result
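
The formula above can be illustrated with a minimal sketch. This is not Curveball's implementation: the names `loglik_sketch` and `linear` are hypothetical and exist only for this example, which assumes NumPy arrays as inputs.

```python
import numpy as np

def loglik_sketch(t, y, y_sig, f, penalty=None, **params):
    """Gaussian log-likelihood of observations y with std. dev. y_sig (illustrative sketch)."""
    yhat = f(t, **params)  # expected observations under the model
    val = -0.5 * np.sum(np.log(2 * np.pi * y_sig**2) + (y - yhat)**2 / y_sig**2)
    if penalty is not None:
        val -= penalty(**params)  # the penalty is subtracted from the log-likelihood
    return float(val)

def linear(t, a, b):
    """Hypothetical toy model, used only for this example."""
    return a * t + b

t = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])   # generated exactly by a=2, b=1
y_sig = np.ones_like(t)         # unit measurement error at every time point

exact = loglik_sketch(t, y, y_sig, linear, a=2.0, b=1.0)
worse = loglik_sketch(t, y, y_sig, linear, a=2.5, b=1.0)
```

With zero residuals only the normalization term remains, so `exact` is the largest value this data set can reach; any other parameter choice, such as the one used for `worse`, gives a lower log-likelihood.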

curveball.likelihood.loglik_r_nu(r_range, nu_range, df, f=<function baranyi_roberts_function>, penalty=None, **params)[source]

Estimates the log-likelihood surface for r and ν given data and a model function.

Parameters

r_range, nu_range : numpy.ndarray

vectors of floats of r and ν values on which to compute the log-likelihood

df : pandas.DataFrame

data frame with Time and OD columns

f : callable, optional

model function, defaults to curveball.baranyi_roberts_model.baranyi_roberts_function()

penalty : callable, optional

if given, the result of penalty will be subtracted from the log-likelihood for each parameter set

params : floats

values for the model parameters used by f

Returns

np.ndarray

two-dimensional array of log-likelihood calculations; the value at index i, j is the log-likelihood for r_range[i] and nu_range[j]

See also: loglik, loglik_r_q0
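
The idea of a log-likelihood surface, one log-likelihood value per pair of parameter values, can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `loglik_surface_sketch` and `logistic` are hypothetical names, a plain logistic curve stands in for the Baranyi-Roberts model, the grid is over r and K instead of r and ν, and the data are passed as arrays of means and standard deviations instead of a data frame.

```python
import numpy as np

def logistic(t, y0, K, r):
    """Logistic growth curve, a stand-in model for this sketch."""
    return K / (1 + (K / y0 - 1) * np.exp(-r * t))

def loglik_surface_sketch(a_range, b_range, t, y, y_sig, f, a_name, b_name, **params):
    """Evaluate a Gaussian log-likelihood on a grid of two named parameters."""
    surface = np.empty((len(a_range), len(b_range)))
    for i, a in enumerate(a_range):
        for j, b in enumerate(b_range):
            # fixed parameters plus the two grid values for this cell
            kwargs = dict(params, **{a_name: a, b_name: b})
            yhat = f(t, **kwargs)
            surface[i, j] = -0.5 * np.sum(
                np.log(2 * np.pi * y_sig**2) + (y - yhat)**2 / y_sig**2
            )
    return surface

# synthetic, noise-free data generated with r=0.5 and K=1.0
t = np.linspace(0, 10, 11)
y = logistic(t, y0=0.1, K=1.0, r=0.5)
y_sig = np.full_like(t, 0.05)

r_range = np.array([0.3, 0.5, 0.7])
K_range = np.array([0.8, 1.0])
surface = loglik_surface_sketch(r_range, K_range, t, y, y_sig, logistic, "r", "K", y0=0.1)

# index of the grid cell with the highest log-likelihood
best = np.unravel_index(np.argmax(surface), surface.shape)
```

As in the documented return value, `surface[i, j]` corresponds to the i-th value of the first range and the j-th value of the second, so `best` points at the generating parameters r=0.5 and K=1.0.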

curveball.likelihood.loglik_r_q0(r_range, q0_range, df, f=<function baranyi_roberts_function>, penalty=None, **params)[source]

Estimates the log-likelihood surface for r and q0 given data and a model function.

Parameters

r_range, q0_range : numpy.ndarray

vectors of floats of r and q0 values on which to compute the log-likelihood

df : pandas.DataFrame

data frame with Time and OD columns

f : callable, optional

model function, defaults to curveball.baranyi_roberts_model.baranyi_roberts_function()

penalty : callable, optional

if given, the result of penalty will be subtracted from the log-likelihood for each parameter set

params : floats

values for the model parameters used by f

Returns

np.ndarray

two-dimensional array of log-likelihood calculations; the value at index i, j is the log-likelihood for r_range[i] and q0_range[j]

See also: loglik, loglik_r_nu

curveball.likelihood.plot_loglik(Ls, xrange, yrange, xlabel=None, ylabel=None, columns=4, fig_title=None, normalize=True, ax_titles=None, cmap='viridis', colorbar=True, ax_width=4, ax_height=4, ax=None)[source]

Plots one or more log-likelihood surfaces.

Parameters

Ls : sequence of numpy.ndarray

list or tuple of two-dimensional log-likelihood arrays; if a single array is given, it is wrapped in a one-element list

xrange, yrange : np.ndarray

values on the x-axis and y-axis of the plot (rows and columns of Ls, respectively)

xlabel, ylabel : str, optional

strings for the x and y labels

columns : int, optional

number of columns in the figure when Ls has more than one matrix

fig_title : str, optional

a title for the whole figure

normalize : bool, optional

if True, all matrices will be plotted using a single color scale

ax_titles : list or tuple of str, optional

titles corresponding to the different matrices in Ls

cmap : str, optional

name of a matplotlib colormap (to see the list, call matplotlib.pyplot.colormaps()), defaults to viridis

colorbar : bool, optional

if True, a colorbar will be added to the plot

ax_width, ax_height : int, optional

width and height of each panel (one for each matrix in Ls)

ax : matplotlib axes or numpy.ndarray of axes, optional

if given, will plot into ax, otherwise will create a new figure

Returns

fig : matplotlib.figure.Figure

figure object

ax : numpy.ndarray

array of axes objects

>>> L = loglik_r_nu(rs, nus, df, y0=y0, K=K, q0=q0, v=v)
>>> plot_loglik(L, rs, nus, normalize=False, fig_title=fig_title, xlabel=r'$r$', ylabel=r'$\nu$', colorbar=False)

curveball.likelihood.plot_model_loglik(m, df, fig_title=None)[source]

Plots the log-likelihood surfaces for ν over r and for q0 over r, given data and a model fitting result.

Parameters

m : lmfit.model.ModelResult

model for which to plot the log-likelihood surface

df : pandas.DataFrame

data frame with Time and OD columns used to fit the model

fig_title : str, optional

title for the plot

Returns

fig : matplotlib.figure.Figure

figure object

ax : numpy.ndarray

array of axes objects

>>> m = curveball.models.fit_model(df)
>>> curveball.likelihood.plot_model_loglik(m, df)

curveball.likelihood.ridge_regularization(lam, **center)[source]

Creates a penalty function that employs the ridge regularization method:

$$P = \lambda \lVert \theta - \theta_0 \rVert_2$$

where λ is the regularization scale, θ is the vector of model parameters, and θ_0 is the vector of model parameter guesses. This is similar to using a multivariate Gaussian prior distribution on the model parameters, with the Gaussian centered at θ_0 and scaled by λ.

Parameters

lam : float

the penalty factor or regularization scale

center : floats, optional

guesses of model parameters

Returns

callable

the penalty function; accepts model parameters as float keyword arguments and returns a float penalty to the log-likelihood

>>> penalty = ridge_regularization(1, y0=0.1, K=1, r=1)
>>> loglik(t, y, y_sig, logistic, penalty=penalty, y0=0.12, K=0.98, r=1.1)
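
To show how such a penalty factory might work, here is a minimal sketch using only the standard library. It is not Curveball's implementation: `ridge_penalty_sketch` is a hypothetical name, and the sketch assumes the penalty is λ times the Euclidean distance between the parameters and their guesses, with parameters that have no guess compared against 0.

```python
import math

def ridge_penalty_sketch(lam, **center):
    """Build a penalty that grows with the distance of the parameters from `center`."""
    def penalty(**params):
        # Euclidean distance between the parameters and their guesses;
        # a parameter without a guess is compared against 0 (assumption of this sketch)
        dist = math.sqrt(sum((v - center.get(k, 0.0)) ** 2 for k, v in params.items()))
        return lam * dist
    return penalty

penalty = ridge_penalty_sketch(2.0, K=1.0, r=1.0)

at_center = penalty(K=1.0, r=1.0)    # parameters equal to the guesses: no penalty
off_center = penalty(K=1.3, r=1.4)   # distance 0.5 from the guesses, scaled by lam
```

Because the returned function takes the model parameters as keyword arguments and returns a float, it has the shape expected of the penalty argument described for loglik, where its value is subtracted from the log-likelihood.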