I/O Utils
The ioutils module contains functions for reading data from automatic plate readers.
The different functions read the data files and generate a data table of type pandas.DataFrame which contains all the relevant data: the read from every well at every time point.
This data table is in a tidy data format, meaning that each row in the table contains a single measurement with the following values (as columns):
- Time: in hours (mandatory)
- OD: optical density, which is a proxy for cell density (mandatory)
- Well: the name of the well, such as “A1” or “H12” (optional)
- Row, Col: the row and column of the well in the plate (optional)
- Strain: the name of the strain (optional)
- Color: the color that should be given to graphs of the data from this well (optional)
Any other columns can also be provided (for example, Cycle Nr. and Temp. [°C] are provided by Tecan Infinity).
Example of a pandas.DataFrame generated using the ioutils module functions:
Time | Temp. [°C] | Cycle Nr. | Well | OD | Row | Col | Strain | Color |
0.0 | 30.0 | 1.0 | A1 | 0.109999999403954 | A | 1 | G | #4daf4a |
0.23244444444444445 | 30.3 | 2.0 | A1 | 0.109899997711182 | A | 1 | G | #4daf4a |
0.46569444444444447 | 30.1 | 3.0 | A1 | 0.110500000417233 | A | 1 | G | #4daf4a |
0.6981111111111112 | 30.1 | 4.0 | A1 | 0.110500000417233 | A | 1 | G | #4daf4a |
0.9305555555555556 | 30.0 | 5.0 | A1 | 0.111599996685982 | A | 1 | G | #4daf4a |
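To make the tidy layout concrete, the following is a minimal sketch of turning a "wide" table (one column per well) into one row per measurement with pandas. The wide table and its values are invented for illustration, and this is not necessarily how the ioutils functions build their output.

```python
import pandas as pd

# Hypothetical wide table: one row per time point, one column per well.
wide = pd.DataFrame({
    "Time": [0.0, 0.25, 0.5],
    "A1": [0.110, 0.111, 0.113],
    "A2": [0.105, 0.107, 0.110],
})

# Melt into tidy form: one row per (Time, Well) measurement.
tidy = wide.melt(id_vars="Time", var_name="Well", value_name="OD")

# Derive the optional Row and Col columns from the well name.
tidy["Row"] = tidy["Well"].str[0]
tidy["Col"] = tidy["Well"].str[1:].astype(int)

print(tidy)
```

Each row of `tidy` now holds a single OD reading together with its time and well, which is the shape described above.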
Plate template
Normally, the output of a plate reader doesn’t include information about the strain in each well. To integrate that information (as well as the colors that should be used for plotting the data from each well), you must provide a plate definition CSV file.
This plate template file is a table in which each row has four values: Row, Col, Strain, and Color.
The Row and Col values define the wells; the Strain and Color values define the names of the strains and their respective colors (for plotting purposes).
These template files can be created using the Plato web app, using Excel (save as .csv), or in any other way that is convenient to you.
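As one example of "any other way", a template with the four columns described above could be generated with a short script. This is only a sketch: the helper `make_plate_template`, the layout function `two_halves`, the strain names and the second color are all hypothetical, and a 96-well plate (rows A-H, columns 1-12) is assumed.

```python
import csv
import io
import string


def make_plate_template(strain_for_well, n_rows=8, n_cols=12):
    """Return CSV text with Row, Col, Strain, Color columns for every well."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Row", "Col", "Strain", "Color"])
    for row in string.ascii_uppercase[:n_rows]:
        for col in range(1, n_cols + 1):
            strain, color = strain_for_well(row, col)
            writer.writerow([row, col, strain, color])
    return buffer.getvalue()


def two_halves(row, col):
    # Hypothetical layout: strain "G" in columns 1-6, strain "R" in columns 7-12.
    return ("G", "#4daf4a") if col <= 6 else ("R", "#e41a1c")


template = make_plate_template(two_halves)
print("\n".join(template.splitlines()[:3]))
```

The resulting text can be written to a `.csv` file and passed around like any other plate template.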
Curveball also ships with some plate template files; type curveball plate --list in the command line for a list of the builtin plate templates:
> curveball plate --list
checkerboard.csv
checkerboard2.csv
DH5a-s12-TG1.csv
DH5a-TG1.csv
G-RG-R.csv
nine-strains.csv
six-strains.csv
Example of the first 5 rows of a plate template file:
Row | Col | Strain | Color |
A | 1 | 0 | #ffffff |
A | 2 | 0 | #ffffff |
A | 3 | 0 | #ffffff |
A | 4 | 0 | #ffffff |
A | 5 | 0 | #ffffff |
A full example can be viewed by typing curveball plate in the command line.
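How a plate template adds Strain and Color to the measurements can be pictured as a join on Row and Col. The sketch below uses a plain pandas merge on invented data; it illustrates the idea only and is not necessarily how the ioutils functions apply the plate argument.

```python
import io

import pandas as pd

# A two-well plate template, inlined here instead of read from disk.
plate_csv = io.StringIO(
    "Row,Col,Strain,Color\n"
    "A,1,0,#ffffff\n"
    "A,2,G,#4daf4a\n"
)
# Read Strain as text so that a strain named 0 is not parsed as a number.
plate = pd.read_csv(plate_csv, dtype={"Strain": str})

# Hypothetical measurements without strain information.
data = pd.DataFrame({
    "Time": [0.0, 0.0, 0.25, 0.25],
    "Well": ["A1", "A2", "A1", "A2"],
    "OD": [0.110, 0.105, 0.111, 0.107],
    "Row": ["A", "A", "A", "A"],
    "Col": [1, 2, 1, 2],
})

# A left join keeps every measurement and attaches Strain and Color by well.
annotated = data.merge(plate, on=["Row", "Col"], how="left")
print(annotated)
```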
Members
- curveball.ioutils.read_curveball_csv(filename, max_time=None, plate=None)
Reads growth measurements from a Curveball csv (comma separated values) file.
- filename : str
  path to the file.
- plate : pandas.DataFrame, optional
  data frame representing a plate, usually generated by reading a CSV file generated by Plato.
pandas.DataFrame
>>> df = curveball.ioutils.read_curveball_csv("data/Tecan_210115.csv")
- curveball.ioutils.read_sunrise_xlsx(filename, label='OD', max_time=None, plate=None)
Reads growth measurements from a Tecan Sunrise Excel output file.
- filename : str
  pattern of the XLSX files to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.
- label : str, optional
  measurement name to use for the data in the file, defaults to OD.
- max_time : float, optional
  maximal time in hours, defaults to infinity.
- plate : pandas.DataFrame, optional
  data frame representing a plate, usually generated by reading a CSV file generated by Plato.
- pandas.DataFrame
Data frame containing the columns:
  - Time (float, in hours)
  - OD (or the value of label, if given)
  - Well (str): the well name, usually a letter for the row and a number for the column.
  - Row (str): the letter corresponding to the well row.
  - Col (str): the number corresponding to the well column.
  - Filename (str): the filename from which this measurement was read.
  - Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.
  - Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.
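The `*` and `?` wildcards accepted in filename can be illustrated with Python's standard `fnmatch` module. This sketch assumes the pattern follows the usual shell-style rules (`*` matches any run of characters, `?` matches exactly one); the file names are invented.

```python
import fnmatch

# Hypothetical directory listing.
filenames = ["plate_1.xlsx", "plate_2.xlsx", "plate_10.xlsx", "notes.txt"]

# "?" matches exactly one character, so plate_10.xlsx is excluded.
single = fnmatch.filter(filenames, "plate_?.xlsx")

# "*" matches any number of characters, so all three plate files match.
many = fnmatch.filter(filenames, "plate_*.xlsx")

print(single)
print(many)
```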
- curveball.ioutils.read_tecan_mat(filename, time_label='tps', value_label='plate_mat', value_name='OD', plate_width=12, max_time=None, plate=None)
Reads growth measurements from a Matlab file generated by a proprietary script at the Pilpel lab.
- filename : str
  name of the Matlab file to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.
- time_label : str, optional
  name of the field used to store the time values, defaults to tps.
- value_label : str
  name of the field used to store the OD values, defaults to plate_mat.
- plate_width : int
  width of the microwell plate in number of wells, defaults to 12.
- max_time : float, optional
  maximal time in hours, defaults to infinity.
- plate : pandas.DataFrame, optional
  data frame representing a plate, usually generated by reading a CSV file generated by Plato.
- pandas.DataFrame
Data frame containing the columns:
  - Time (float, in hours)
  - OD (float)
  - Well (str): the well name, usually a letter for the row and a number for the column.
  - Row (str): the letter corresponding to the well row.
  - Col (str): the number corresponding to the well column.
  - Filename (str): the filename from which this measurement was read.
  - Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.
  - Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.
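The role of plate_width can be sketched as the mapping from a flat well index to a well name. The helper `well_name` below is hypothetical, and it assumes wells are stored row by row (A1, A2, ..., A12, B1, ...); the actual ordering in the Matlab files may differ.

```python
import string


def well_name(index, plate_width=12):
    """Map a zero-based, row-major well index to a name such as 'A1'."""
    row, col = divmod(index, plate_width)
    return f"{string.ascii_uppercase[row]}{col + 1}"


# First well, last well of the first row, first well of the second row,
# and last well of a 96-well plate.
names = [well_name(i) for i in (0, 11, 12, 95)]
print(names)
```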
- curveball.ioutils.read_tecan_xlsx(filename, label='OD', sheets=None, max_time=None, plate=None, PRINT=False)
Reads growth measurements from a Tecan Infinity Excel output file.
- filename : str
  path to the file.
- label : str / sequence of str
  a string or sequence of strings containing measurement names used as titles of the data tables in the file.
- sheets : list, optional
  list of sheet numbers, if known. Otherwise the function will try all the sheets.
- max_time : float, optional
  maximal time in hours, defaults to infinity.
- plate : pandas.DataFrame, optional
  data frame representing a plate, usually generated by reading a CSV file generated by Plato.
- pandas.DataFrame
Data frame containing the columns:
  - Time (float, in hours)
  - Temp. [°C] (float)
  - Cycle Nr. (int)
  - Well (str): the well name, usually a letter for the row and a number for the column.
  - Row (str): the letter corresponding to the well row.
  - Col (str): the number corresponding to the well column.
  - Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.
  - Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.
There will also be a separate column for each label, and if there is more than one label, a separate Time and Temp. [°C] column for each label.
- ValueError
  if no data was parsed from the file.
>>> plate = pd.read_csv("plate_templates/G-RG-R.csv")
>>> df = curveball.ioutils.read_tecan_xlsx("data/Tecan_210115.xlsx", label=('OD','Green','Red'), max_time=12, plate=plate)
>>> df.shape
(8544, 9)
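The effect of max_time can be pictured as dropping measurements taken after the given number of hours. The helper `truncate` below is a hypothetical sketch on invented data; whether Curveball treats the boundary as inclusive is an assumption here.

```python
import math

import pandas as pd


def truncate(df, max_time=None):
    """Keep rows with Time <= max_time; None means no limit (infinity)."""
    limit = math.inf if max_time is None else max_time
    return df[df["Time"] <= limit]


# Hypothetical measurements spanning 18 hours.
data = pd.DataFrame({
    "Time": [0.0, 6.0, 12.0, 18.0],
    "OD": [0.11, 0.25, 0.60, 0.85],
})

first_half_day = truncate(data, max_time=12)
everything = truncate(data)
print(first_half_day)
```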
- curveball.ioutils.read_tecan_xml(filename, label='OD', max_time=None, plate=None)
Reads growth measurements from Tecan Infinity XML output files.
- filename : str
  pattern of the XML files to be read. Use * and ? in filename to read multiple files and parse them into a single data frame.
- label : str, optional
  measurement name used as Name in the measurement sections in the file, defaults to OD.
- max_time : float, optional
  maximal time in hours, defaults to infinity.
- plate : pandas.DataFrame, optional
  data frame representing a plate, usually generated by reading a CSV file generated by Plato.
- pandas.DataFrame
Data frame containing the columns:
  - Time (float, in hours)
  - Well (str): the well name, usually a letter for the row and a number for the column.
  - Row (str): the letter corresponding to the well row.
  - Col (str): the number corresponding to the well column.
  - Filename (str): the filename from which this measurement was read.
  - Strain (str): if a plate was given, this is the strain name corresponding to the well from the plate.
  - Color (str, hex format): if a plate was given, this is the strain color corresponding to the well from the plate.
There will also be a separate column for the value of the label.
>>> import zipfile
>>> with zipfile.ZipFile("data/20130211_dh.zip") as z:
...     z.extractall("data/20130211_dh")
>>> plate = pd.read_csv("plate_templates/checkerboard.csv")
>>> df = curveball.ioutils.read_tecan_xml("data/20130211_dh/*.xml", 'OD', plate=plate)
>>> df.shape
(2016, 8)
This function was adapted from choderalab/assaytools (licensed under LGPL).