Tutorial

This Curveball tutorial walks through loading, processing, and analysing a real growth curve dataset.

About this tutorial

There is no better way to learn how to use a new tool than to see it applied in a real-world situation. This tutorial will explain the workings of most of Curveball in the context of analysing a real growth curve dataset.

The data we will be using is an Excel file (Tecan_280715.xlsx), which I produced by growing two bacterial strains (DH5α, denoted by G, and TG1, denoted by R) in a 96-well plate (Fig. 1) inside a Tecan Infinity plate reader over 17 hours at the Berman Lab in Tel-Aviv University.

https://d33wubrfki0l68.cloudfront.net/208d4035a83df3eb92fd194b9fd4e32262b4baa4/d6d06/_images/example_plot_plate.svg

Fig. 1 Plate template for the Tecan_280715 experiment, generated from the G-RG-R.csv plate template file. Green is for DH5α; Red is for TG1; Blue is for wells with both strains; White is for blank wells.

This tutorial assumes you are comfortable with the command line, but does not assume any prior experience with data processing, data analysis, or Python programming.

To follow the tutorial, go ahead and open a command line window (or a terminal).

Note

To open a command line (or terminal) in:

  • Windows: click the Start button, type cmd.exe and press Enter.

  • Linux: press Ctrl-Alt-T.

  • OS X: search for terminal in Spotlight.

Installing Curveball

Follow the Installation instructions, then check that Curveball was successfully installed:

>>> curveball --version
curveball, version x.x.x

where x.x.x will be replaced by the current version number (0.2.14+7.gd5152a8).
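The same check can be made from inside Python. This is a minimal sketch, assuming Curveball was installed under the distribution name curveball and that Python 3.8 or later is used; the helper name installed_version is made up for illustration and is not part of Curveball.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="curveball"):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()  # "curveball" is the assumed distribution name
if found is None:
    print("curveball is not installed in this Python environment")
else:
    print("curveball, version", found)
```

If the message says that curveball is not installed, the terminal is probably using a different Python environment from the one Curveball was installed into.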

Getting the data

The dataset we will be using is packaged with Curveball. To find the path to Curveball:

>>> curveball --where
C:\Anaconda\lib\site-packages\curveball-0.2.14-py3.7.egg\curveball

Of course, the path might be different on your machine. From now on if you see CURVEBALL_PATH, replace that with the path you just got.

The data file resides at CURVEBALL_PATH/../data/Tecan_280715.xlsx. Let’s check it’s there. On Windows:

>>> dir CURVEBALL_PATH\..\data\Tecan_280715.xlsx /B
Tecan_280715.xlsx

On Linux and OS X:

>>> ls CURVEBALL_PATH/../data/Tecan_280715.xlsx
CURVEBALL_PATH/../data/Tecan_280715.xlsx

If you see File Not Found or No such file or directory then something went wrong; check that you replaced CURVEBALL_PATH with the path printed by curveball --where and try again.
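The check above can also be written in a platform-independent way. The following is a rough sketch using only the Python standard library; the function name data_file is hypothetical, and the string CURVEBALL_PATH must be replaced by the path you got from curveball --where.

```python
from pathlib import Path


def data_file(curveball_path, name="Tecan_280715.xlsx"):
    """Build the path CURVEBALL_PATH/../data/<name> without touching the disk."""
    return Path(curveball_path) / ".." / "data" / name


# Replace "CURVEBALL_PATH" with the path printed by `curveball --where`.
candidate = data_file("CURVEBALL_PATH")
print(candidate, "exists" if candidate.exists() else "is missing")
```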

Now let’s create a new folder and copy the data file into it. On Windows:

>>> mkdir curveball-tutorial
>>> cd curveball-tutorial
>>> copy CURVEBALL_PATH\..\data\Tecan_280715.xlsx .
1 file(s) copied.

On Linux and OS X:

>>> mkdir curveball-tutorial
>>> cd curveball-tutorial
>>> cp CURVEBALL_PATH/../data/Tecan_280715.xlsx .
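For readers who prefer to script this step, here is a small sketch that creates the folder and copies the file with the Python standard library. The function name copy_into_tutorial_folder is an illustrative choice, not something Curveball provides.

```python
import shutil
from pathlib import Path


def copy_into_tutorial_folder(source, folder="curveball-tutorial"):
    """Create `folder` if needed, copy `source` into it, and return the new path."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the copied file inside target_dir
    return Path(shutil.copy(source, target_dir))
```

Calling copy_into_tutorial_folder with the full path of Tecan_280715.xlsx has the same effect as the mkdir and copy commands above, except that it does not change the current directory.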

Analysing the data

Now we can proceed to analyse the data using Curveball.

For this, we will use the curveball analyse command:

>>> curveball analyse Tecan_280715.xlsx --plate_file=G-RG-R.csv --ref_strain=G

This command will:

  • Load the data from the file

  • Fit growth models to the data separately for each strain

  • Select the best model fit for each strain

  • Use the best model fits to simulate a competition between the strains

  • Infer the fitness of the strains from the simulated competition
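To give a feeling for the model selection step, here is a minimal sketch of choosing between two fitted models with the Akaike information criterion (AIC). It assumes a common least-squares form of the AIC and made-up residual sums of squares; the exact criterion and formulas Curveball uses internally may differ.

```python
import math


def aic_least_squares(rss, n_points, n_params):
    """AIC of a least-squares fit, n*ln(RSS/n) + 2k (one common form)."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def akaike_weights(aic_by_model):
    """Turn AIC scores into weights that sum to 1; a higher weight means more support."""
    best = min(aic_by_model.values())
    relative = {name: math.exp(-0.5 * (aic - best)) for name, aic in aic_by_model.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


# Hypothetical fits of two models to the same 100 data points: (RSS, number of parameters)
fits = {"Logistic": (5.2, 3), "Richards": (4.1, 4)}
aics = {name: aic_least_squares(rss, 100, k) for name, (rss, k) in fits.items()}
best_model = min(aics, key=aics.get)  # the lowest AIC wins
print(best_model, akaike_weights(aics))
```

The extra parameter of the more flexible model is penalised by the 2k term, so it is selected only if it reduces the residual error enough to compensate.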

Note

Some interesting options we used:

  • --plate_file: sets the plate template file to be G-RG-R.csv (Fig. 1). Plate template files can be generated with Plato.

  • --ref_strain: sets the green strain (G) to be the reference strain when inferring fitness; i.e., the fitness of G is set to 1 and other strains are compared to it.
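To illustrate what fitness relative to a reference strain means, here is a sketch of one common definition: the ratio of the log fold-changes of the two strains over the competition. Whether Curveball computes fitness in exactly this way is an assumption, and the densities used below are invented.

```python
import math


def relative_fitness(n0, nt, ref_n0, ref_nt):
    """Log fold-change of a strain divided by the log fold-change of the reference."""
    return math.log(nt / n0) / math.log(ref_nt / ref_n0)


# Invented start and end densities of a competition between R and the reference G
g_start, g_end = 0.01, 0.04
r_start, r_end = 0.01, 0.16

print("w(G) =", relative_fitness(g_start, g_end, g_start, g_end))
print("w(R) =", relative_fitness(r_start, r_end, g_start, g_end))
```

With this definition the reference strain always has fitness 1 when compared to itself, and a strain that grows by more doublings than the reference over the same period has fitness above 1.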

Running the command creates several figures (as .png files):

https://d33wubrfki0l68.cloudfront.net/3e2808ea256f36f67c2d77f450af41dd3de7950a/c2442/_images/tecan_280715_wells.png

Fig. 2 Growth curve in each well of the plate.

https://d33wubrfki0l68.cloudfront.net/001a69ba4062121debcd9d92922866197def7974/3aa52/_images/tecan_280715_strains.png

Fig. 3 Mean growth curve of each strain.

https://d33wubrfki0l68.cloudfront.net/18bdb3793c81eef63cf6aa1302ba211e74d7bc38/2182f/_images/tecan_280715_strain_g.png

Fig. 4 Model fitting and selection plot for strain G.

https://d33wubrfki0l68.cloudfront.net/8065b587bb5c1bddd625e760be727f9b87a19f89/3e426/_images/tecan_280715_r_vs_g.png

Fig. 5 Results of the simulated competition.

The command also prints a table summarising each strain, including all the growth parameters estimated by Curveball.

Here is the summary table, shown with one row per parameter and one column per strain:

Parameter         G                       RG                      R
CV(RMSD)          3.2268012165058684      2.262879688701894       2.247844745610476
K                 0.47640620696972436     0.5936843103849805      0.5652280054891252
NRMSD             2.22728137310951        1.5527889458266357      1.4469797526720822
RMSD              1.1555135384274589      1.0563622809919289      1.0157797857720747
RSS               4443.58399676388        3713.719422241823       3219.242748331465
aic               974.0930416828226       376.9588434910281       109.69962160302045
bic               998.533548890587        401.39935069879255      133.88197472623455
filename          Tecan_280715            Tecan_280715            Tecan_280715
folder
has_lag           False                   False                   False
has_nu            True                    True                    True
lag               3.621736189170586       2.7117955840198515      2.452954294045998
max_growth_rate   0.6117580972097533      1.0196721284638344      1.2261256656077906
model             Model(Richards)         Model(Richards)         Model(Richards)
nu                1.0934960197113022      0.3806627109531364      0.2623946225027076
q0                0                       0                       0
r                 0.6589146816688329      1.600870967897836       2.3747696936882408
strain            G                       RG                      R
v                 0                       0                       0
w                 1.0                     1.5011151951501949      1.7677034405757033
weighted_aic      0.7814714485883857      0.7857363772805844      0.7857457272152057
weighted_bic      0.8867850675435222      0.9894944399297481      0.989150478549629
y0                0.005724935470917902    0.0016820365206082766   0.0008287187140247898
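To see what the estimated parameters describe, we can plug the values for strain G into a growth curve. This is a sketch under stated assumptions: that the Richards model in the table is the usual closed-form solution of dN/dt = r N (1 - (N/K)^nu), and that y0, K, r and nu are the initial density, maximum density, growth rate and curvature parameter. This reading of the column names is not confirmed by this tutorial.

```python
import math


def richards(t, y0, K, r, nu):
    """Closed-form solution of dN/dt = r*N*(1 - (N/K)**nu) with N(0) = y0."""
    return K / (1.0 - (1.0 - (K / y0) ** nu) * math.exp(-r * nu * t)) ** (1.0 / nu)


# Parameter values copied from the summary table for strain G
g = dict(
    y0=0.005724935470917902,
    K=0.47640620696972436,
    r=0.6589146816688329,
    nu=1.0934960197113022,
)

# Predicted density at a few time points (hours) over the 17-hour experiment
for hour in (0, 4, 8, 12, 17):
    print(hour, round(richards(hour, **g), 4))
```

Under these assumptions the curve starts at y0 and levels off at K, which is why K is read here as the maximum density.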

Note

We can run curveball again, this time with the -o summary.csv option, which saves this table to a file named summary.csv instead of printing it to the command line.
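Once the table is in a file, it can be read back with any CSV reader. The sketch below uses the Python standard library and assumes that summary.csv has a header row with the column names shown in the table above, including strain and w; the function name fitness_by_strain is made up for illustration.

```python
import csv
import io


def fitness_by_strain(csv_file):
    """Map each strain name to its fitness `w`, read from an open CSV file."""
    reader = csv.DictReader(csv_file)
    return {row["strain"]: float(row["w"]) for row in reader}


# Stand-in for summary.csv with only three of the columns (values rounded from the table)
sample = io.StringIO(
    "strain,model,w\n"
    "G,Model(Richards),1.0\n"
    "RG,Model(Richards),1.5011\n"
    "R,Model(Richards),1.7677\n"
)
print(fitness_by_strain(sample))

# With the real file:
# with open("summary.csv", newline="") as f:
#     print(fitness_by_strain(f))
```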

Additional commands and options

Let’s see which commands and options curveball supports:

>>> curveball --help
Usage: curveball-script.py [OPTIONS] COMMAND [ARGS]...
.
Options:
  -v, --verbose / -V, --no-verbose
  -l, --plot / -L, --no-plot
  -p, --prompt / -P, --no-prompt
  --where                         prints the path where Curveball is installed
  --version                       Show the version and exit.
  --help                          Show this message and exit.
.
Commands:
  analyse  Analyse growth curves using Curveball.
  plate    Read and output a plate from a plate file.

We’ve already seen --version, --where, and now --help. As for the other options:

  • --verbose allows us to get more information printed from curveball; this is useful for bug hunting when we don’t get the results we think we should get.

  • --no-plot turns off plotting; no plot files will be created, so curveball will finish faster.

  • --prompt turns on prompting; curveball will ask for confirmation, for example, when choosing the plate template file.

We can also list the options that each command, such as analyse or plate, accepts:

>>> curveball analyse --help
Usage: curveball-script.py analyse [OPTIONS] PATH
.
  Analyse growth curves using Curveball. Outputs estimated growth traits and
  fitness of all strains in all files in folder PATH or matching the pattern
  PATH.
.
Options:
  --max_time FLOAT            omit data after max_time hours
  --ref_strain TEXT           reference strain for competitions
  --blank_strain TEXT         blank strain for background calibration
  -o, --output_file FILENAME  output csv file path
  --plate_file TEXT           plate templates csv file
  --plate_folder PATH         plate templates default folder
  --help                      Show this message and exit.
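When the same analysis has to be repeated for many files, it can be convenient to assemble the command line in a script. The following sketch builds an argument list from the options shown in the help text above; the function name build_analyse_command is hypothetical, and the sketch only prints the command rather than running it.

```python
import shlex


def build_analyse_command(path, plate_file=None, ref_strain=None,
                          output_file=None, max_time=None, plot=True):
    """Return the argument list for a `curveball analyse` call."""
    cmd = ["curveball"]
    if not plot:
        cmd.append("--no-plot")  # global option, placed before the command name
    cmd += ["analyse", path]
    if plate_file is not None:
        cmd += ["--plate_file", plate_file]
    if ref_strain is not None:
        cmd += ["--ref_strain", ref_strain]
    if output_file is not None:
        cmd += ["--output_file", output_file]
    if max_time is not None:
        cmd += ["--max_time", str(max_time)]
    return cmd


cmd = build_analyse_command("Tecan_280715.xlsx", plate_file="G-RG-R.csv",
                            ref_strain="G", output_file="summary.csv")
print(shlex.join(cmd))  # shlex.join requires Python 3.8 or later
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than as one string, avoids quoting problems when file names contain spaces.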

Getting help

Please don’t hesitate to contact me (Yoav Ram) with any questions, comments, or suggestions: