I/O Utils

The ioutils module contains functions for reading data from automatic plate readers. Each function reads data files and generates a data table of type pandas.DataFrame that contains all the relevant data: the reading from every well at every time point.

This data table is in a tidy data format, meaning that each row in the table contains a single measurement with the following values (as columns):

  • Time: in hours (mandatory)

  • OD: optical density which is a proxy for cell density (mandatory)

  • Well: as in the name of the well such as “A1” or “H12” (optional)

  • Row, Col: the row and column of the well in the plate (optional)

  • Strain: the name of the strain (optional)

  • Color: the color that should be given to graphs of the data from this well (optional)

Any other columns can also be provided (for example, Cycle Nr. and Temp. [°C] are provided by Tecan Infinity).

Example of a pandas.DataFrame generated using the ioutils module functions:

Time                 Temp. [°C]  Cycle Nr.  Well  OD                 Row  Col  Strain  Color
0.0                  30.0        1.0        A1    0.109999999403954  A    1    G       #4daf4a
0.23244444444444445  30.3        2.0        A1    0.109899997711182  A    1    G       #4daf4a
0.46569444444444447  30.1        3.0        A1    0.110500000417233  A    1    G       #4daf4a
0.6981111111111112   30.1        4.0        A1    0.110500000417233  A    1    G       #4daf4a
0.9305555555555556   30.0        5.0        A1    0.111599996685982  A    1    G       #4daf4a
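As a rough illustration of the tidy layout described above, the sketch below builds a small data frame by hand with pandas, with one row per measurement. The values are made up for illustration and were not produced by a plate reader.

```python
import pandas as pd

# Hand-made measurements in tidy format: one row per (well, time point).
# The numbers are illustrative, not real plate reader output.
df = pd.DataFrame({
    "Time": [0.0, 0.25, 0.0, 0.25],      # hours (mandatory)
    "OD": [0.110, 0.112, 0.095, 0.096],  # optical density (mandatory)
    "Well": ["A1", "A1", "A2", "A2"],    # optional
    "Row": ["A", "A", "A", "A"],
    "Col": [1, 1, 2, 2],
    "Strain": ["G", "G", "R", "R"],
    "Color": ["#4daf4a", "#4daf4a", "#e41a1c", "#e41a1c"],
})

# Because each row is a single measurement, selecting the growth curve
# of one well is a simple boolean filter.
a1 = df[df["Well"] == "A1"]
print(a1[["Time", "OD"]])
```

The same filter works on any of the optional columns, for example Strain.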

Plate template

Normally, the output of a plate reader doesn’t include information about the strain in each well. To integrate that information (as well as the colors that should be used for plotting the data from each well), you must provide a plate definition CSV file.

This plate template file is a table in which each row has four values: Row, Col, Strain, and Color. The Row and Col values define the wells; the Strain and Color values define the names of the strains and their respective colors (for plotting purposes). These template files can be created using the Plato web app, using Excel (save as .csv), or in any other way that is convenient to you.
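One such way is a short script. The following is a minimal sketch, assuming a 96-well plate (rows A to H, columns 1 to 12) and a made-up layout in which odd columns hold strain G and even columns hold strain R. The strain names, the colors, and the make_plate_template helper are all hypothetical.

```python
import csv
import io

def make_plate_template(rows="ABCDEFGH", ncols=12):
    """Return plate template records with Row, Col, Strain and Color keys."""
    records = []
    for row in rows:
        for col in range(1, ncols + 1):
            # Hypothetical layout: alternate two strains by column.
            if col % 2 == 1:
                strain, color = "G", "#4daf4a"
            else:
                strain, color = "R", "#e41a1c"
            records.append({"Row": row, "Col": col, "Strain": strain, "Color": color})
    return records

records = make_plate_template()

# Write to an in-memory buffer; for a real file, use
# open("my-plate.csv", "w", newline="") instead (the file name is made up).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Row", "Col", "Strain", "Color"])
writer.writeheader()
writer.writerows(records)
print(buffer.getvalue().splitlines()[:3])
```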

Curveball also ships with some plate template files. Type curveball plate --list in the command line for a list of the built-in plate templates:

> curveball plate --list
checkerboard.csv
checkerboard2.csv
DH5a-s12-TG1.csv
DH5a-TG1.csv
G-RG-R.csv
nine-strains.csv
six-strains.csv

Example of the first 5 rows of a plate template file:

Row  Col  Strain  Color
A    1    0       #ffffff
A    2    0       #ffffff
A    3    0       #ffffff
A    4    0       #ffffff
A    5    0       #ffffff

A full example can be viewed by typing curveball plate in the command line.
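Conceptually, combining a plate template with plate reader data amounts to matching wells by Row and Col. The sketch below shows this with a plain pandas merge on made-up data. It illustrates the idea and is not necessarily how the ioutils functions implement it.

```python
import pandas as pd

# Made-up measurements without strain information.
data = pd.DataFrame({
    "Time": [0.0, 0.0, 0.25, 0.25],
    "Row": ["A", "A", "A", "A"],
    "Col": [1, 2, 1, 2],
    "OD": [0.110, 0.095, 0.112, 0.096],
})

# Made-up plate template: one row per well.
plate = pd.DataFrame({
    "Row": ["A", "A"],
    "Col": [1, 2],
    "Strain": ["G", "R"],
    "Color": ["#4daf4a", "#e41a1c"],
})

# A left merge keeps every measurement and attaches the Strain and Color of its well.
annotated = pd.merge(data, plate, on=["Row", "Col"], how="left")
print(annotated)
```

A left merge is used here so that measurements from wells missing in the template are kept, with empty Strain and Color values.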

Members

curveball.ioutils.read_curveball_csv(filename, max_time=None, plate=None)

Reads growth measurements from a Curveball csv (comma separated values) file.

Parameters
filename : str

path to the file.

max_time : float, optional

maximal time in hours, defaults to infinity

plate : pandas.DataFrame, optional

data frame representing a plate, usually generated by reading a CSV file generated by Plato.

Returns
pandas.DataFrame

Examples

>>> import curveball.ioutils
>>> df = curveball.ioutils.read_curveball_csv("data/Tecan_210115.csv")

curveball.ioutils.read_sunrise_xlsx(filename, label='OD', max_time=None, plate=None)

Reads growth measurements from a Tecan Sunrise Excel output file.

Parameters
filename : str

pattern of the XLSX files to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.

label : str, optional

measurement name to use for the data in the file, defaults to OD.

max_time : float, optional

maximal time in hours, defaults to infinity

plate : pandas.DataFrame, optional

data frame representing a plate, usually generated by reading a CSV file generated by Plato.

Returns
pandas.DataFrame

Data frame containing the columns:

  • Time (float, in hours)

  • OD (or the value of label, if given)

  • Well (str): the well name, usually a letter for the row and a number of the column.

  • Row (str): the letter corresponding to the well row.

  • Col (str): the number corresponding to the well column.

  • Filename (str): the filename from which this measurement was read.

  • Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.

  • Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.
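The * and ? wildcards accepted in filename follow the usual shell-style pattern rules. As a sketch of how such a pattern expands to a list of files, the example below uses the glob module from the Python standard library on temporary files with made-up names. Whether ioutils uses glob internally is an assumption.

```python
import glob
import os
import tempfile

# Create a few empty files with made-up names in a temporary folder.
folder = tempfile.mkdtemp()
for name in ["plate_1.xlsx", "plate_2.xlsx", "plate_10.xlsx", "notes.txt"]:
    open(os.path.join(folder, name), "w").close()

# * matches any number of characters, ? matches exactly one character.
all_plates = sorted(glob.glob(os.path.join(folder, "plate_*.xlsx")))
single_digit = sorted(glob.glob(os.path.join(folder, "plate_?.xlsx")))

print([os.path.basename(f) for f in all_plates])
print([os.path.basename(f) for f in single_digit])
```

Here plate_*.xlsx matches all three plate files, while plate_?.xlsx skips plate_10.xlsx because ? stands for a single character.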

curveball.ioutils.read_tecan_mat(filename, time_label='tps', value_label='plate_mat', value_name='OD', plate_width=12, max_time=None, plate=None)

Reads growth measurements from a Matlab file generated by a proprietary script at the Pilpel lab.

Parameters
filename : str

name of the Matlab file to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.

time_label : str, optional

name of the field used to store the time values, defaults to tps.

value_label : str, optional

name of the field used to store the OD values, defaults to plate_mat.

plate_width : int

width of the microwell plate in number of wells, defaults to 12.

max_time : float, optional

maximal time in hours, defaults to infinity

plate : pandas.DataFrame, optional

data frame representing a plate, usually generated by reading a CSV file generated by Plato.

Returns
pandas.DataFrame

Data frame containing the columns:

  • Time (float, in hours)

  • OD (float)

  • Well (str): the well name, usually a letter for the row and a number of the column.

  • Row (str): the letter corresponding to the well row.

  • Col (str): the number corresponding to the well column.

  • Filename (str): the filename from which this measurement was read.

  • Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.

  • Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.
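The plate_width parameter suggests that wells are stored in a flat order and need to be mapped back to well names. A minimal sketch of such a mapping follows, assuming a zero-based, row-major well order. Both this assumption and the well_name helper are hypothetical and not taken from the curveball source.

```python
import string

def well_name(index, plate_width=12):
    """Map a zero-based, row-major well index to (Well, Row, Col)."""
    row_idx, col_idx = divmod(index, plate_width)
    row = string.ascii_uppercase[row_idx]  # 0 -> "A", 1 -> "B", ...
    col = col_idx + 1                      # columns are numbered from 1
    return f"{row}{col}", row, col

print(well_name(0))   # first well of a 96-well plate
print(well_name(95))  # last well of a 96-well plate
```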

curveball.ioutils.read_tecan_xlsx(filename, label='OD', sheets=None, max_time=None, plate=None, PRINT=False)

Reads growth measurements from a Tecan Infinity Excel output file.

Parameters
filename : str

path to the file.

label : str / sequence of str

a string or sequence of strings containing measurement names used as titles of the data tables in the file.

sheets : list, optional

list of sheet numbers, if known. Otherwise the function will try to read all the sheets.

max_time : float, optional

maximal time in hours, defaults to infinity

plate : pandas.DataFrame, optional

data frame representing a plate, usually generated by reading a CSV file generated by Plato.

Returns
pandas.DataFrame

Data frame containing the columns:

  • Time (float, in hours)

  • Temp. [°C] (float)

  • Cycle Nr. (int)

  • Well (str): the well name, usually a letter for the row and a number of the column.

  • Row (str): the letter corresponding to the well row.

  • Col (str): the number corresponding to the well column.

  • Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.

  • Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.

There will also be a separate column for each label, and if there is more than one label, a separate Time and Temp. [°C] column for each label.

Raises
ValueError

if no data was parsed from the file.
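The effect of max_time can be pictured as dropping all measurements taken after the given time. Below is a sketch of the equivalent filter on an already loaded data frame, using made-up values. Whether the cutoff in ioutils is inclusive is an assumption here.

```python
import pandas as pd

df = pd.DataFrame({
    "Time": [0.0, 6.0, 12.0, 18.0],  # hours
    "OD": [0.11, 0.25, 0.60, 0.85],  # made-up values
})

max_time = 12
# Keep measurements up to and including max_time
# (the inclusive cutoff is an assumption).
trimmed = df[df["Time"] <= max_time]
print(trimmed)
```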

Examples

>>> import pandas as pd
>>> import curveball.ioutils
>>> plate = pd.read_csv("plate_templates/G-RG-R.csv")
>>> df = curveball.ioutils.read_tecan_xlsx("data/Tecan_210115.xlsx", label=('OD','Green','Red'), max_time=12, plate=plate)
>>> df.shape
(8544, 9)

curveball.ioutils.read_tecan_xml(filename, label='OD', max_time=None, plate=None)

Reads growth measurements from Tecan Infinity XML output files.

Parameters
filename : str

pattern of the XML files to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.

label : str, optional

measurement name used as Name in the measurement sections in the file, defaults to OD.

max_time : float, optional

maximal time in hours, defaults to infinity

plate : pandas.DataFrame, optional

data frame representing a plate, usually generated by reading a CSV file generated by Plato.

Returns
pandas.DataFrame

Data frame containing the columns:

  • Time (float, in hours)

  • Well (str): the well name, usually a letter for the row and a number of the column.

  • Row (str): the letter corresponding to the well row.

  • Col (str): the number corresponding to the well column.

  • Filename (str): the filename from which this measurement was read.

  • Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.

  • Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.

There will also be a separate column for the value of the label.

Notes

This function was adapted from choderalab/assaytools (licensed under LGPL).

Examples

>>> import zipfile
>>> import pandas as pd
>>> import curveball.ioutils
>>> with zipfile.ZipFile("data/20130211_dh.zip") as z:
...     z.extractall("data/20130211_dh")
>>> plate = pd.read_csv("plate_templates/checkerboard.csv")
>>> df = curveball.ioutils.read_tecan_xml("data/20130211_dh/*.xml", 'OD', plate=plate)
>>> df.shape
(2016, 8)

curveball.ioutils.write_curveball_csv(df, filename)

Writes growth measurements to a Curveball csv (comma separated values) file.

Parameters
df : pandas.DataFrame

data frame to write

filename : str

path to the output file
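Since a Curveball csv file holds the tidy data table described at the top of this page, a write and read round trip can be sketched with plain pandas. This is only an illustration with made-up values. The exact options that write_curveball_csv and read_curveball_csv use are not specified here, and index=False is an assumption.

```python
import io
import pandas as pd

df = pd.DataFrame({
    "Time": [0.0, 0.25],
    "OD": [0.110, 0.112],
    "Well": ["A1", "A1"],
})

# Write to an in-memory buffer instead of a file on disk.
# index=False (an assumption) avoids writing the row index as an extra column.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read the same text back into a new data frame.
buffer.seek(0)
df2 = pd.read_csv(buffer)
print(df2)
```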