I/O Utils

The ioutils module contains functions for reading data from automatic plate readers. Each function reads the data files and generates a data table of type pandas.DataFrame that contains all the relevant data: the reading from every well at every time point.

This data table is in a tidy data format, meaning that each row in the table contains a single measurement with the following values (as columns):

Time: in hours (mandatory)
OD: optical density which is a proxy for cell density (mandatory)
Well: as in the name of the well such as “A1” or “H12” (optional)
Row, Col: the row and column of the well in the plate (optional)
Strain: the name of the strain (optional)
Color: the color that should be given to graphs of the data from this well (optional)

Any other columns can also be provided (for example, Cycle Nr. and Temp. [°C] are provided by Tecan Infinity).

Example of a pandas.DataFrame generated using the ioutils module functions:

Time	Temp. [°C]	Cycle Nr.	Well	OD	Row	Col	Strain	Color
0.0	30.0	1.0	A1	0.109999999403954	A	1	G	#4daf4a
0.23244444444444445	30.3	2.0	A1	0.109899997711182	A	1	G	#4daf4a
0.46569444444444447	30.1	3.0	A1	0.110500000417233	A	1	G	#4daf4a
0.6981111111111112	30.1	4.0	A1	0.110500000417233	A	1	G	#4daf4a
0.9305555555555556	30.0	5.0	A1	0.111599996685982	A	1	G	#4daf4a
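As an illustration of the tidy format described above, the following sketch reshapes a small wide table (one column per well) into one row per measurement using pandas. The wide table and its values are made up for the example; this is not how the ioutils functions are implemented.

```python
import pandas as pd

# Hypothetical wide-format readings: one row per time point, one column per well.
wide = pd.DataFrame({
    "Time": [0.0, 0.25, 0.5],      # hours
    "A1": [0.110, 0.112, 0.115],   # made-up OD readings for well A1
    "A2": [0.109, 0.111, 0.113],   # made-up OD readings for well A2
})

# Melt into tidy format: one row per (Time, Well) measurement.
tidy = wide.melt(id_vars="Time", var_name="Well", value_name="OD")

# Derive the optional Row and Col columns from the well name.
tidy["Row"] = tidy["Well"].str[0]
tidy["Col"] = tidy["Well"].str[1:].astype(int)

print(tidy)
```

Each row of tidy now holds a single measurement, with the mandatory Time and OD columns and the optional Well, Row, and Col columns.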

Plate template

Normally, the output of a plate reader doesn’t include information about the strain in each well. To integrate that information (as well as the colors that should be used for plotting the data from each well), you must provide a plate definition CSV file.

This plate template file is a table in which each row has four values: Row, Col, Strain, and Color. The Row and Col values define the wells; the Strain and Color values define the names of the strains and their respective colors (for plotting purposes). These template files can be created using the Plato web app, using Excel (save as .csv), or in any other way that is convenient to you.
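One such "other way" is to build the template with pandas. The sketch below lays out a 96-well plate (8 rows by 12 columns) with blank outer wells and two alternating strains inside. The layout, the strain names, and the color of strain R are assumptions made for the example; only the four column names come from the description above.

```python
import string
import pandas as pd

# Hypothetical strains and colors: "0" marks blank wells.
colors = {"0": "#ffffff", "G": "#4daf4a", "R": "#e41a1c"}

records = []
for i, row in enumerate(string.ascii_uppercase[:8]):   # rows A..H
    for col in range(1, 13):                           # columns 1..12
        if row in "AH" or col in (1, 12):
            strain = "0"                               # outer wells are blanks
        else:
            strain = "G" if (i + col) % 2 == 0 else "R"  # alternate inner wells
        records.append({"Row": row, "Col": col,
                        "Strain": strain, "Color": colors[strain]})

plate = pd.DataFrame(records, columns=["Row", "Col", "Strain", "Color"])

# Render as CSV text; pass a path (for example "my_plate.csv") to write a file instead.
csv_text = plate.to_csv(index=False)
print(csv_text.splitlines()[:3])
```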

Curveball also ships with some plate template files; type curveball plate --list in the command line for a list of the built-in plate templates:

> curveball plate --list
checkerboard.csv
checkerboard2.csv
DH5a-s12-TG1.csv
DH5a-TG1.csv
G-RG-R.csv
nine-strains.csv
six-strains.csv

Example of the first 5 rows of a plate template file:

Row	Col	Strain	Color
A	1	0	#ffffff
A	2	0	#ffffff
A	3	0	#ffffff
A	4	0	#ffffff
A	5	0	#ffffff

A full example can be viewed by typing curveball plate in the command line.
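To illustrate how a plate template adds strain information to the measurements, the following sketch joins a small tidy data table with a template on the Row and Col columns. Both tables are made up, and the join is only an assumption about what the plate argument of the reader functions achieves, not Curveball's actual implementation.

```python
import pandas as pd

# Hypothetical measurements for two wells, already in tidy format.
data = pd.DataFrame({
    "Time": [0.0, 0.0, 0.25, 0.25],
    "OD":   [0.110, 0.109, 0.112, 0.111],
    "Row":  ["A", "A", "A", "A"],
    "Col":  [1, 2, 1, 2],
})

# Hypothetical plate template rows for the same wells.
plate = pd.DataFrame({
    "Row":    ["A", "A"],
    "Col":    [1, 2],
    "Strain": ["0", "G"],
    "Color":  ["#ffffff", "#4daf4a"],
})

# A left join keeps every measurement and attaches Strain and Color by well.
annotated = data.merge(plate, on=["Row", "Col"], how="left")
print(annotated)
```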

Members

curveball.ioutils.read_curveball_csv(filename, max_time=None, plate=None)

Reads growth measurements from a Curveball csv (comma separated values) file.

filename (str): path to the file.
max_time (float, optional): maximal time in hours, defaults to infinity
plate (pandas.DataFrame, optional): data frame representing a plate, usually generated by reading a CSV file generated by Plato.

pandas.DataFrame

>>> df = curveball.ioutils.read_curveball_csv("data/Tecan_210115.csv")

curveball.ioutils.read_sunrise_xlsx(filename, label='OD', max_time=None, plate=None)

Reads growth measurements from a Tecan Sunrise Excel output file.

filename (str): pattern of the XLSX files to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.
label (str, optional): measurement name to use for the data in the file, defaults to OD.
max_time (float, optional): maximal time in hours, defaults to infinity
plate (pandas.DataFrame, optional): data frame representing a plate, usually generated by reading a CSV file generated by Plato.

pandas.DataFrame

Data frame containing the columns:

Time (float, in hours)
OD (or the value of label, if given)
Well (str): the well name, usually a letter for the row and a number for the column.
Row (str): the letter corresponding to the well row.
Col (str): the number corresponding to the well column.
Filename (str): the filename from which this measurement was read.
Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.
Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.
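The * and ? wildcards in filename can be tried out with Python's standard glob module, as sketched below. That Curveball expands patterns with exactly these semantics is an assumption, and the file names are made up for the example.

```python
import glob
import os
import tempfile

# Create a few empty, hypothetically named files in a temporary folder.
folder = tempfile.mkdtemp()
for name in ["plate_1.xlsx", "plate_2.xlsx", "plate_10.xlsx", "notes.txt"]:
    open(os.path.join(folder, name), "w").close()

# "*" matches any run of characters; "?" matches exactly one character.
all_xlsx = sorted(
    os.path.basename(p) for p in glob.glob(os.path.join(folder, "*.xlsx"))
)
single_digit = sorted(
    os.path.basename(p) for p in glob.glob(os.path.join(folder, "plate_?.xlsx"))
)

print(all_xlsx)      # the three .xlsx files, but not notes.txt
print(single_digit)  # only plate_1.xlsx and plate_2.xlsx
```

Checking a pattern this way before passing it to a reader function shows which files would be combined into the single data frame.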

curveball.ioutils.read_tecan_mat(filename, time_label='tps', value_label='plate_mat', value_name='OD', plate_width=12, max_time=None, plate=None)

Reads growth measurements from a Matlab file generated by a proprietary script at the Pilpel lab.

filename (str): name of the Matlab file to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.
time_label (str, optional): name of the field used to store the time values, defaults to tps.
value_label (str, optional): name of the field used to store the OD values, defaults to plate_mat.
plate_width (int, optional): width of the microwell plate in number of wells, defaults to 12.
max_time (float, optional): maximal time in hours, defaults to infinity
plate (pandas.DataFrame, optional): data frame representing a plate, usually generated by reading a CSV file generated by Plato.

pandas.DataFrame

Data frame containing the columns:

Time (float, in hours)
OD (float)
Well (str): the well name, usually a letter for the row and a number for the column.
Row (str): the letter corresponding to the well row.
Col (str): the number corresponding to the well column.
Filename (str): the filename from which this measurement was read.
Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.
Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.
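To show how a plate width relates flat well positions to well names, the sketch below maps a zero-based index to a name such as A1. The helper well_name is hypothetical, and the row-major ordering is an assumption: the Matlab files may store wells in a different order, so this is not necessarily what read_tecan_mat does.

```python
import string


def well_name(index, plate_width=12):
    """Map a zero-based, row-major well index to a name such as 'A1'.

    Hypothetical helper for illustration; row-major ordering is assumed.
    """
    row, col = divmod(index, plate_width)
    return f"{string.ascii_uppercase[row]}{col + 1}"


# On a 96-well plate (width 12), index 12 starts the second row.
print(well_name(0), well_name(11), well_name(12), well_name(95))
```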

curveball.ioutils.read_tecan_xlsx(filename, label='OD', sheets=None, max_time=None, plate=None, PRINT=False)

Reads growth measurements from a Tecan Infinity Excel output file.

filename (str): path to the file.
label (str or sequence of str): a string or sequence of strings containing measurement names used as titles of the data tables in the file.
sheets (list, optional): list of sheet numbers, if known. Otherwise the function will try to read all the sheets.
max_time (float, optional): maximal time in hours, defaults to infinity
plate (pandas.DataFrame, optional): data frame representing a plate, usually generated by reading a CSV file generated by Plato.

pandas.DataFrame

Data frame containing the columns:

Time (float, in hours)
Temp. [°C] (float)
Cycle Nr. (int)
Well (str): the well name, usually a letter for the row and a number for the column.
Row (str): the letter corresponding to the well row.
Col (str): the number corresponding to the well column.
Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.
Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.

There will also be a separate column for each label, and if there is more than one label, a separate Time and Temp. [°C] column for each label.

ValueError: if no data was parsed from the file.

>>> import pandas as pd
>>> import curveball
>>> plate = pd.read_csv("plate_templates/G-RG-R.csv")
>>> df = curveball.ioutils.read_tecan_xlsx("data/Tecan_210115.xlsx", label=('OD','Green','Red'), max_time=12, plate=plate)
>>> df.shape
(8544, 9)
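The effect of max_time can be pictured as a filter on the Time column, as in the sketch below. The data are made up, and whether Curveball treats the bound as inclusive is an assumption.

```python
import pandas as pd

# Hypothetical tidy measurements spanning 18 hours.
df = pd.DataFrame({
    "Time": [0.0, 6.0, 12.0, 18.0],   # hours
    "OD":   [0.11, 0.35, 0.80, 0.95],
})

max_time = 12  # hours

# Keep only measurements taken up to max_time (inclusive bound assumed).
trimmed = df[df["Time"] <= max_time]
print(trimmed)
```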

curveball.ioutils.read_tecan_xml(filename, label='OD', max_time=None, plate=None)

Reads growth measurements from Tecan Infinity XML output files.

filename (str): pattern of the XML files to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.
label (str, optional): measurement name used as Name in the measurement sections in the file, defaults to OD.
max_time (float, optional): maximal time in hours, defaults to infinity
plate (pandas.DataFrame, optional): data frame representing a plate, usually generated by reading a CSV file generated by Plato.

pandas.DataFrame

Data frame containing the columns:

Time (float, in hours)
Well (str): the well name, usually a letter for the row and a number for the column.
Row (str): the letter corresponding to the well row.
Col (str): the number corresponding to the well column.
Filename (str): the filename from which this measurement was read.
Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.
Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.

There will also be a separate column for the value of the label.

>>> import zipfile
>>> import pandas as pd
>>> import curveball
>>> with zipfile.ZipFile("data/20130211_dh.zip") as z:
...     z.extractall("data/20130211_dh")
>>> plate = pd.read_csv("plate_templates/checkerboard.csv")
>>> df = curveball.ioutils.read_tecan_xml("data/20130211_dh/*.xml", 'OD', plate=plate)
>>> df.shape
(2016, 8)

This function was adapted from choderalab/assaytools (licensed under LGPL).

curveball.ioutils.write_curveball_csv(df, filename)

Writes growth measurements to a Curveball csv (comma separated values) file.

df (pandas.DataFrame): data frame to write
filename (str): path to the output file
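As a rough picture of a write and read round trip, the sketch below saves a tidy data frame as comma separated text with pandas and loads it back. It assumes that a Curveball csv is an ordinary CSV of the tidy columns; it uses plain pandas calls and an in-memory buffer rather than write_curveball_csv and read_curveball_csv themselves.

```python
import io
import pandas as pd

# Hypothetical tidy measurements for a single well.
df = pd.DataFrame({
    "Time":   [0.0, 0.25, 0.5],
    "OD":     [0.125, 0.25, 0.5],
    "Well":   ["A1", "A1", "A1"],
    "Strain": ["G", "G", "G"],
})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind and read the same text back into a new data frame.
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored)
```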