Likelihood
The likelihood module contains functions used to calculate and visualize the log-likelihood surfaces for models and data in Curveball.
Functions that use growth data expect a pandas.DataFrame generated by the ioutils module.
Functions that use the results of model fitting often require lmfit.model.ModelResult objects.
Members
-
curveball.likelihood.loglik(t, y, y_sig, f, penalty=None, **params)
Computes the log-likelihood of seeing the data given a model, assuming normally distributed observation/measurement errors.
\[\log{L(y | \theta)} = -\frac{1}{2} \sum_i \left[ \log{(2 \pi \sigma_{i}^{2})} + \frac{(y_i - f(t_i; \theta))^2}{\sigma_{i}^{2}} \right]\]
which is the log-likelihood of seeing the data points \((t_i, y_i)\) with measurement error \(\sigma_i\), given the model function \(f\) and the model parameters \(\theta\), assuming that the measurement error at time \(t_i\) is normally distributed with mean 0.
- t : np.ndarray
one-dimensional array of time points
- y : np.ndarray
one-dimensional array of the means of the observations
- y_sig : np.ndarray
one-dimensional array of the standard deviations of the observations
- f : callable
a function that calculates the expected observations (f(t)) from t and any parameters in params
- penalty : callable, optional
a function that calculates a scalar penalty from the parameters in params, to be subtracted from the log-likelihood
- params : floats, optional
model parameters
- float
the log-likelihood result
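The formula above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not Curveball's actual implementation: the names `gaussian_loglik` and `linear` are hypothetical, and the penalty is assumed to be called with the same keyword parameters as the model function.

```python
import numpy as np


def gaussian_loglik(t, y, y_sig, f, penalty=None, **params):
    """Illustrative Gaussian log-likelihood (hypothetical helper, not library code)."""
    t, y, y_sig = (np.asarray(a, dtype=float) for a in (t, y, y_sig))
    yhat = f(t, **params)            # expected observations f(t_i; theta)
    var = y_sig ** 2                 # sigma_i^2
    val = -0.5 * np.sum(np.log(2 * np.pi * var) + (y - yhat) ** 2 / var)
    if penalty is not None:
        val -= penalty(**params)     # the scalar penalty is subtracted
    return float(val)


def linear(t, a, b):
    """Toy model used only for this example."""
    return a * t + b


t = np.array([0.0, 1.0, 2.0])
y = linear(t, a=2.0, b=1.0)          # synthetic data that the model fits exactly
y_sig = np.ones_like(t)              # unit measurement error at every time point
ll = gaussian_loglik(t, y, y_sig, linear, a=2.0, b=1.0)
```

With zero residuals and unit errors, only the normalization term remains, so `ll` equals \(-\frac{3}{2} \log{(2 \pi)}\); any other parameter values give a lower log-likelihood.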
-
curveball.likelihood.loglik_r_nu(r_range, nu_range, df, f=<function baranyi_roberts_function>, penalty=None, **params)
Estimates the log-likelihood surface for \(r\) and \(\nu\) given data and a model function.
- r_range, nu_range : numpy.ndarray
vectors of floats of \(r\) and \(\nu\) values on which to compute the log-likelihood
- df : pandas.DataFrame
data frame with Time and OD columns
- f : callable, optional
model function, defaults to curveball.baranyi_roberts_model.baranyi_roberts_function()
- penalty : callable, optional
if given, the result of penalty will be subtracted from the log-likelihood for each parameter set
- params : floats
values for the model parameters used by f
- np.ndarray
two-dimensional array of log-likelihood calculations; the value at index i, j is the log-likelihood for r_range[i] and nu_range[j]
See also: loglik, loglik_r_q0
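To illustrate the index convention of such a surface, here is a minimal sketch that evaluates a Gaussian log-likelihood on an \((r, \nu)\) grid. It is not the library's code: `richards` and `loglik_surface` are hypothetical names, a Richards growth curve stands in for the Baranyi-Roberts model, and plain arrays of means and standard deviations are used instead of a data frame with Time and OD columns.

```python
import numpy as np


def richards(t, y0, K, r, nu):
    """Toy Richards growth curve (stand-in model for this example only)."""
    return K / (1 - (1 - (K / y0) ** nu) * np.exp(-r * nu * t)) ** (1 / nu)


def loglik_surface(r_range, nu_range, t, y, y_sig, f, **params):
    """Hypothetical helper: entry [i, j] is the log-likelihood for r_range[i], nu_range[j]."""
    surface = np.empty((len(r_range), len(nu_range)))
    var = y_sig ** 2
    for i, r in enumerate(r_range):
        for j, nu in enumerate(nu_range):
            yhat = f(t, r=r, nu=nu, **params)
            surface[i, j] = -0.5 * np.sum(np.log(2 * np.pi * var) + (y - yhat) ** 2 / var)
    return surface


t = np.linspace(0, 12, 25)
y = richards(t, y0=0.1, K=1.0, r=0.8, nu=2.0)   # noise-free synthetic data
y_sig = np.full_like(t, 0.05)                   # assumed measurement error

rs = np.linspace(0.4, 1.2, 9)                   # rows of the surface
nus = np.linspace(1.0, 3.0, 5)                  # columns of the surface
L = loglik_surface(rs, nus, t, y, y_sig, richards, y0=0.1, K=1.0)

# Locate the grid point with the highest log-likelihood
i, j = np.unravel_index(np.argmax(L), L.shape)
```

Because the synthetic data were generated with \(r = 0.8\) and \(\nu = 2\), the maximum of `L` falls on the grid point `rs[i]`, `nus[j]` that matches those values.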
-
curveball.likelihood.loglik_r_q0(r_range, q0_range, df, f=<function baranyi_roberts_function>, penalty=None, **params)
Estimates the log-likelihood surface for \(r\) and \(q_0\) given data and a model function.
- r_range, q0_range : numpy.ndarray
vectors of floats of \(r\) and \(q_0\) values on which to compute the log-likelihood
- df : pandas.DataFrame
data frame with Time and OD columns
- f : callable, optional
model function, defaults to curveball.baranyi_roberts_model.baranyi_roberts_function()
- penalty : callable, optional
if given, the result of penalty will be subtracted from the log-likelihood for each parameter set
- params : floats
values for the model parameters used by f
- np.ndarray
two-dimensional array of log-likelihood calculations; the value at index i, j is the log-likelihood for r_range[i] and q0_range[j]
See also: loglik, loglik_r_nu
-
curveball.likelihood.plot_loglik(Ls, xrange, yrange, xlabel=None, ylabel=None, columns=4, fig_title=None, normalize=True, ax_titles=None, cmap='viridis', colorbar=True, ax_width=4, ax_height=4, ax=None)
Plots one or more log-likelihood surfaces.
- Ls : sequence of numpy.ndarray
list or tuple of two-dimensional log-likelihood arrays; if a single array is given, it will be converted to a list of size 1
- xrange, yrange : np.ndarray
values on x-axis and y-axis of the plot (rows and columns of Ls, respectively)
- xlabel, ylabel : str, optional
strings for x and y labels
- columns : int, optional
number of columns, in case Ls has more than one matrix
- fig_title : str, optional
a title for the whole figure
- normalize : bool, optional
if True, all matrices will be plotted using a single color scale
- ax_titles : list or tuple of str, optional
titles corresponding to the different matrices in Ls
- cmap : str, optional
name of a matplotlib colormap (to see the list, call matplotlib.pyplot.colormaps()), defaults to viridis
- colorbar : bool, optional
if True, a colorbar will be added to the plot
- ax_width, ax_height : int
width and height of each panel (one for each matrix in Ls)
- ax : matplotlib axes or numpy.ndarray of axes
if given, will plot into ax, otherwise will create a new figure
- fig : matplotlib.figure.Figure
figure object
- ax : numpy.ndarray
array of axes objects
>>> L = loglik_r_nu(rs, nus, df, y0=y0, K=K, q0=q0, v=v)
>>> plot_loglik(L, rs, nus, normalize=False, fig_title=fig_title, xlabel=r'$r$', ylabel=r'$\nu$', colorbar=False)
-
curveball.likelihood.plot_model_loglik(m, df, fig_title=None)
Plots the log-likelihood surfaces for \(\nu\) over \(r\) and \(q_0\) over \(r\) for the given data and model fitting result.
- m : lmfit.model.ModelResult
model for which to plot the log-likelihood surface
- df : pandas.DataFrame
data frame with Time and OD columns used to fit the model
- fig_title : str
title for the plot
- fig : matplotlib.figure.Figure
figure object
- ax : numpy.ndarray
array of axes objects
>>> m = curveball.models.fit_model(df)
>>> curveball.likelihood.plot_model_loglik(m, df)
-
curveball.likelihood.ridge_regularization(lam, **center)
Creates a penalty function that employs the ridge regularization method:
\[P = \lambda ||\theta - \theta_0||_2\]
where \(\lambda\) is the regularization scale, \(\theta\) is the vector of model parameters, and \(\theta_0\) is the vector of model parameter guesses. This is similar to using a multivariate Gaussian prior distribution on the model parameters, with the Gaussian centered at \(\theta_0\) and scaled by \(\lambda\).
- lam : float
the penalty factor or regularization scale
- center : floats, optional
guesses of model parameters
- callable
the penalty function, accepts model parameters as float keyword arguments and returns a float penalty to the log-likelihood
>>> penalty = ridge_regularization(1, y0=0.1, K=1, r=1)
>>> loglik(t, y, y_sig, logistic, penalty=penalty, y0=0.12, K=0.98, r=1.1)
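The penalty formula can be sketched as a closure over the guesses. This is a hedged illustration rather than the library's implementation: `make_ridge_penalty` is a hypothetical name, and the choice to ignore parameters that have no guess in `center` is an assumption of this sketch.

```python
import numpy as np


def make_ridge_penalty(lam, **center):
    """Hypothetical helper: returns P(theta) = lam * ||theta - theta_0||_2."""
    def penalty(**params):
        # Assumption: only parameters that have a guess in `center` contribute
        diffs = [params[name] - guess for name, guess in center.items() if name in params]
        return float(lam * np.linalg.norm(diffs))
    return penalty


penalty = make_ridge_penalty(2.0, K=1.0, r=1.0)
p = penalty(K=1.3, r=0.6)    # differences are 0.3 and -0.4, so the 2-norm is 0.5
```

Here `p` is \(2 \times 0.5 = 1\), and the penalty vanishes when the parameters equal the guesses.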