Export To Database exports data directly to a database, or in database readable format, including an imported file with column names and a CellProfiler Analyst properties file, if desired
This module exports measurements directly to a database or to a SQL-compatible format. It allows you to create and import MySQL and associated data files into a database and gives you the option of creating a properties file for use with CellProfiler Analyst. Optionally, you can create an SQLite database file if you do not have a server on which to run MySQL itself.
This module must be run at the end of a pipeline, or second to last if you are using the CreateBatchFiles module. If you forget this module, you can also run the ExportDatabase data tool after processing is complete; its functionality is the same.
The database is set up with two primary tables. These tables are the Per_Image table and the Per_Object table (which may have a prefix if you specify). The Per_Image table consists of all the per-image measurements made during the pipeline, plus per-image population statistics (such as mean, median, and standard deviation) of the object measurements. There is one per_image row for every "cycle" that CellProfiler processes (a cycle is usually a single field of view, and a single cycle usually contains several image files, each representing a different channel of the same field of view). The Per_Object table contains all the measurements for individual objects. There is one row of object measurements per object identified. The two tables are connected with the primary key column ImageNumber, which indicates the image to which each object belongs. The Per_Object table has another primary key called ObjectNumber, which is unique to each image. Typically, if multiple types of objects are identified and measured in a pipeline, the numbers of those objects are equal to each other. For example, in most pipelines, each nucleus has exactly one cytoplasm, so the first row of the Per-Object table contains all of the information about object #1, including both nucleus- and cytoplasm-related measurements. If this one-to-one correspondence is not the case for all objects in the pipeline (for example, if dozens of speckles are identified and measured for each nucleus), then you must configure ExportToDatabase to export only objects that maintain the one-to-one correspondence (for example, export only Nucleus and Cytoplasm, but omit Speckles).
If you have extracted "Plate" and "Well" metadata from image filenames or loaded "Plate" and "Well" metadata via LoadData, you can ask CellProfiler to create a "Per_Well" table, which aggregates object measurements across wells. This option will output a SQL file (regardless of whether you choose to write directly to the database) that can be used to create the Per_Well table. At the secure shell where you normally log in to MySQL, type the following, replacing the italics with references to your database and files:
mysql -h hostname -u username -p databasename <pathtoimages/perwellsetupfile.SQL
The commands written by CellProfiler to create the Per_Well table will be executed.
Oracle is not fully supported at present; you can create your own Oracle DB using the .csv output option and writing a simple script to upload to the database.
Available measurements
For details on the nomenclature used by CellProfiler for the exported measurements, see
Help > General Help > How Measurements Are Named.
See also ExportToSpreadsheet.
Settings:
Database type
What type of database do you want to use?
- MySQL
allows the data to be written directly to a MySQL
database. MySQL is open-source software; you may require help from
your local Information Technology group to set up a database
server.
- MySQL / CSV writes a script file that
contains SQL statements for creating a database and uploading the
Per_Image and Per_Object tables. This option will write out the Per_Image
and Per_Object table data to two CSV files; you can use these files can be
used to import the data directly into an application
that accepts CSV data.
- SQLite writes
SQLite files directly. SQLite is simpler to set up than MySQL and
can more readily be run on your local computer rather than requiring a
database server. More information about SQLite can be found at
here.
Database name
Select a name for the database you want to use
SQL file prefix
(Used if SQL is selected as the database type and if CSV files are to be written)
What prefix do you want to use to name the SQL file?
Name the SQLite database file
(Used if SQLite selected as database type)
What is the SQLite database filename to which you want to write?
Add a prefix to table names?
Do you want to add a prefix to your table names?
This option enables you to prepend text to your table names
(Per_Image and Per_Object). CellProfiler will warn you before overwriting an existing table.
Table prefix
(Used if Add a prefix to table names? is selected)
What is the table prefix you want to use?
Create a CellProfiler Analyst properties file?
You can generate a template properties file that will allow you to use your new database with CellProfiler Analyst (a data
exploration tool which can also be downloaded from
http://www.cellprofiler.org/ ).
The module will attempt to fill in as many as the entries as possible
based on the pipeline's settings, including the
server name, username and password if MySQL or Oracle is used.
Enter an image url prepend if you plan to access your files via http
(Used only if creating a properties file)
The image paths written to the database will be the absolute
path the the image files on your computer. If you plan to make these
files accessible via the web, you can enter a url prefix here. Eg:
If an image is loaded from the path "/cellprofiler/images/" and you use
a url prepend of "http://mysite.com/", CellProfiler Analyst will look
for your file at "http://mysite.com/cellprofiler/images/"
If you are not using the web to access your files (i.e., they are locally
aceesible by your computer), leave this setting blank.
Select the plate type
(Used only if creating a properties file)
If you are using a multi-well plate or microarray, you can select the plate
type here. Supported types in CellProfiler Analyst are 96- and 384-well plates,
as well as 5600-spot microarrays. If you are not using a plate or microarray, select
None.
Select the plate metadata
(Used only if creating a properties file)
If you are using a multi-well plate or microarray, you can select the metadata corresponding
to the plate here. If there is no plate metadata associated with the image set, select
None.
Please see LoadImages, LoadData, or Help > Using CellProfiler > How Data Is Handled > Using Metadata In CellProfiler
for more details on obtaining, extracting, and using metadata tags from your images.
Select the well metadata
(Used only if creating a properties file)
If you are using a multi-well plate or microarray, you can select the metadata corresponding
to the well here. If there is no well metadata associated with the image set, select
None.
Please see LoadImages, LoadData, or Help > Using CellProfiler > How Data Is Handled > Using Metadata In CellProfiler
for more details on obtaining, extracting, and using metadata tags from your images.
Include information for all images, using default values?
(Used only if creating a properties file)
Check this setting to include information in the properties file for all images.
Leaving this box checked will do the following:
- All images loaded using LoadImages, LoadData or saved in SaveImages will be included.
- The CellProfiler image name will be used for the image_name field.
- A channel color listed in the image_channel_colors field will be assigned to the image by default order.
Leave this box unchecked to specify which images should be included or to override the automatic values.
Output file location
(Used only when saving csvs, or creating a properties file)
This setting determines where the .csv files are saved if
you decide to save measurements to files instead of writing them
directly to the database. If you request a CellProfiler Analyst properties file,
it will also be saved to this location.
You can choose among the following options which are common to all file input/output
modules:
- Default Input Folder: Use the default input folder.
- Default Output Folder: Use from the default output folder.
- Elsewhere...: Use a particular folder you specify.
- Default input directory sub-folder: Enter the name of a subfolder of
the default input folder or a path that starts from the default input folder.
- Default output directory sub-folder: Enter the name of a subfolder of
the default output folder or a path that starts from the default output folder.
Elsewhere and the two sub-folder options all require you to enter an additional
path name. You can use an absolute path (such as "C:\imagedir\image.tif" on a PC) or a
relative path to specify the file location relative to a directory):
- Use one period to represent the current directory. For example, if you choose
Default Input Folder sub-folder, you can enter "./MyFiles" to look in a
folder called "MyFiles" that is contained within the Default Input Folder.
- Use two periods ".." to move up one folder level. For example, if you choose
Default Input Folder sub-folder, you can enter "../MyFolder" to look in a
folder called "MyFolder" at the same level as the Default Input Folder.
For Elsewhere..., Default Input Folder sub-folder and
Default Output Folder sub-folder, if you have metadata associated with your
images via LoadImages or LoadData, you can name the folder using metadata
tags.
You can insert a previously defined metadata tag by either using:
- The insert key
- A right mouse button click inside the control
- In Windows, the Context menu key, which is between the Windows key and Ctrl key
The inserted metadata tag will appear in green. To change a previously inserted metadata tag, navigate the cursor to just before the tag and either:
- Use the up and down arrows to cycle through possible values.
- Right-click on the tag to display and select the available values.
For instance, if you have a metadata tag named
"Plate", you can create a per-plate folder by selecting one of the subfolder options
and then specifying the subfolder name with the "Plate" metadata tag.
The module will substitute the metadata values for the last image set
processed for any metadata tags in the folder name.
Please see
LoadImages,
LoadData, or
Help > Using CellProfiler > How Data Is Handled > Using Metadata In CellProfiler
for more details on obtaining, extracting, and using metadata tags from your images.
Calculate the per-image mean values of object measurements?
ExportToDatabase can calculate population statistics over all the objects in each image
and store the results in the database. For instance, if
you are measuring the area of the Nuclei objects and you check the box for this option,
ExportToDatabase will create a column in the Per_Image
table called "Mean_Nuclei_AreaShape_Area".
You may not want to use ExportToDatabase to calculate these population statistics if your pipeline generates
a large number of per-object measurements; doing so might exceed database
column limits. These columns can be created manually for selected measurements directly in MySQL.
For instance, the following SQL command creates the Mean_Nuclei_AreaShape_Area column:
ALTER TABLE Per_Image ADD (Mean_Nuclei_AreaShape_Area);
UPDATE Per_Image SET Mean_Nuclei_AreaShape_Area =
(SELECT AVG(Nuclei_AreaShape_Area)
FROM Per_Object
WHERE Per_Image.ImageNumber = Per_Object.ImageNumber);
Calculate the per-well mean values of object measurements?
ExportToDatabase can calculate statistics over all the objects in each well
and store the results as columns in a "per-well" table in the database. For instance,
if you are measuring the area of the Nuclei objects and you check the aggregate
mean box in this module,
ExportToDatabase will create a table in the database called
"Per_Well_avg", with a column called "Mean_Nuclei_AreaShape_Area". Selecting all three aggregate measurements will create three per-well tables, one for each of the measurements.
The per-well functionality will create the appropriate lines in a .SQL file, which can be run on your Per-Image and Per-Object tables to create the desired per-well table.
Note: this option is only
available if you have extracted plate and well metadata from the filename or via a LoadData module.
It will write out a .sql file with the statements necessary to create the Per_Well
table, regardless of the option chosen above.
Please see LoadImages, LoadData, or Help > Using CellProfiler > How Data Is Handled > Using Metadata In CellProfiler
for more details on obtaining, extracting, and using metadata tags from your images
Calculate the per-well median values of object measurements?
ExportToDatabase can calculate statistics over all the objects in each well
and store the results as columns in a "per-well" table in the database. For instance,
if you are measuring the area of the Nuclei objects and you check the aggregate
median box in this module,
ExportToDatabase will create a table in the database called
"Per_Well_median", with a column called "Median_Nuclei_AreaShape_Area". Selecting all three aggregate measurements will create three per-well tables, one for each of the measurements.
The per-well functionality will create the appropriate lines in a .SQL file, which can be run on your Per-Image and Per-Object tables to create the desired per-well table.
Note: this option is only
available if you have extracted plate and well metadata from the filename or via a LoadData module.
It will write out a .sql file with the statements necessary to create the Per_Well
table, regardless of the option chosen above.
Please see LoadImages, LoadData, or Help > Using CellProfiler > How Data Is Handled > Using Metadata In CellProfiler
for more details on obtaining, extracting, and using metadata tags from your images
Calculate the per-well standard deviation values of object measurements?
ExportToDatabase can calculate statistics over all the objects in each well
and store the results as columns in a "per-well" table in the database. For instance,
if you are measuring the area of the Nuclei objects and you check the aggregate
standard deviation box in this module,
ExportToDatabase will create a table in the database called
"Per_Well_std", with a column called "Mean_Nuclei_AreaShape_Area". Selecting all three aggregate measurements will create three per-well tables, one for each of the measurements.
The per-well functionality will create the appropriate lines in a .SQL file, which can be run on your Per-Image and Per-Object tables to create the desired per-well table.
Note: this option is only
available if you have extracted plate and well metadata from the filename or via a LoadData module.
It will write out a .sql file with the statements necessary to create the Per_Well
table, regardless of the option chosen above.
Please see LoadImages, LoadData, or Help > Using CellProfiler > How Data Is Handled > Using Metadata In CellProfiler
for more details on obtaining, extracting, and using metadata tags from your images
Export measurements for all objects to the database?
This option lets you choose the objects whose measurements will be saved in the Per_Object and Per_Well(s) database tables.
- All: Export measurements from all objects.
- None: Do not export data to a Per_Object table. Save only Per_Image or Per_Well measurements (which nonetheless include population statistics from objects).
- Select: Select the objects you want to export from a list.
Select the objects
(Used only if Select is chosen for adding objects)
Choose one or more objects from this list (click using shift or command keys to select multiple objects). The list includes
the objects that were created by prior modules. If you choose an
object, its measurements will be written out to the Per_Object and/or
Per_Well(s) tables, otherwise, the object's measurements will be skipped.
Create one table per object or a single object table?
ExportToDatabase can create either one table
for each type of object exported or a single
object table.
- One table per object type creates one
table for each object type you export. The table name will reflect
the name of your objects. The table will have one row for each
of your objects. You can write SQL queries that join tables using
the "Number_ObjectNumber" columns of parent objects (such as those
created by IdentifyPrimaryObjects) with the corresponding
"Parent_... column" of the child objects. Choose
One table per object type if parent objects can have more than
one child object, if you want a relational representation of
your objects in the database,
or if you need to split columns among different
tables and shorten column names because of database limitations.
- Single object table creates a single
database table that records all object measurements.
ExportToDatabase will prepend each column name with the
name of the object associated with that column's measurement.
Each row of the table will have measurements for all objects
that have the same image and object number. Choose
Single object table if parent objects have a single child,
or if you want a simple table structure in your database.
Maximum # of characters in a column name
This setting limits the number of characters that can appear
in the name of a field in the database. MySQL has a limit of 64
characters per field, but also has an overall limit on the number of characters
in all of the columns of a table. ExportToDatabase will
shorten all of the column names by removing characters, at the
same time guaranteeing that no two columns have the same name.
Write image thumbnails directly to the database?
(Used only if MySQL is selected as database type)
Check this option if you'd like to write image thumbnails directly
into the database. This will slow down the writing step, but will
enable new functionality in CellProfiler Analyst such as quickly
viewing images in the Plate Viewer tool by selecting "thumbnail"
from the "Well display" dropdown.
Select the images you want to save thumbnails of
(Used only if MySQL is selected as database type and writing thumbnails is selected)
Select the images that you wish to save as thumbnails to
the database.
Auto-scale thumbnail pixel intensities?
(Used only if MySQL is selected as database type and writing thumbnails is selected)
Check this option if you'd like to automatically rescale
the thumbnail pixel intensities to the range 0-1, where 0 is
black/unsaturated, and 1 is white/saturated.