Module: LoadData

Load Data loads text or numerical data to be associated with images, and can also load images specified by file names.
This module loads a file that supplies text or numerical data associated with the images to be processed, e.g., sample names, plate names, well identifiers, or even a list of image filenames to be processed in the analysis run.

Disclaimer: Please note that the Input modules (i.e., Images, Metadata, NamesAndTypes and Groups) largely supersede this module. However, old pipelines loaded into CellProfiler that contain this module will provide the option of preserving it; these pipelines will operate exactly as before.

The module currently reads files in CSV (comma-separated values) format. These files can be produced by saving a spreadsheet from Excel as "Windows Comma Separated Values" file format. The lines of the file represent the rows, and each field in a row is separated by a comma. Text values may be optionally enclosed by double quotes. The LoadData module uses the first row of the file as a header. The fields in this row provide the labels for each column of data. Subsequent rows provide the values for each image cycle.
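If you prefer to generate the file from a script rather than a spreadsheet, the following is a minimal sketch using Python's standard csv module. The column names and values are taken from the example used in this help; the variable names are illustrative only, and the sketch writes to an in-memory buffer rather than to a file on disk.

```python
import csv
import io

# Header row first, then one row per image cycle.
header = ["Image_FileName_FITC", "Image_PathName_FITC",
          "Metadata_Plate", "Titration_NaCl_uM"]
records = [
    ["04923_d1.tif", "2009-07-08", "P-12345", 750],
    ["51265_d1.tif", "2009-07-09", "P-12345", 2750],
]

# newline="" lets the csv module control line endings itself.
buffer = io.StringIO(newline="")
# QUOTE_NONNUMERIC encloses text fields in double quotes and leaves numbers bare.
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(header)
writer.writerows(records)

csv_text = buffer.getvalue()
csv_lines = csv_text.splitlines()
```

To write a real file, replace the buffer with open("my_data.csv", "w", newline="") (a hypothetical file name).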

There are many reasons why you might want to prepare a CSV file and load it via LoadData. Below, we describe how the column nomenclature allows for special functionality for some downstream modules:

Example CSV file:

Image_FileName_FITC,Image_PathName_FITC,Metadata_Plate,Titration_NaCl_uM
"04923_d1.tif","2009-07-08","P-12345",750
"51265_d1.tif","2009-07-09","P-12345",2750

After the first row of header information (the column names), the first image-specific row specifies the file, "2009-07-08/04923_d1.tif" for the FITC image (2009-07-08 is the name of the subfolder that contains the image, relative to the Default Input Folder). The plate metadata is "P-12345" and the NaCl titration used in the well is 750 uM. The second image-specific row has the values "2009-07-09/51265_d1.tif", "P-12345" and 2750 uM. The NaCl titration for the image is available for modules that use numeric metadata, such as CalculateStatistics; "Titration" will be the category and "NaCl_uM" will be the measurement.
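The interpretation described above can be sketched in Python as follows. This is only an illustration of the naming convention, not LoadData's actual implementation; the helper names split_column_name and image_path are hypothetical.

```python
import csv
import io
import posixpath

CSV_TEXT = '''Image_FileName_FITC,Image_PathName_FITC,Metadata_Plate,Titration_NaCl_uM
"04923_d1.tif","2009-07-08","P-12345",750
"51265_d1.tif","2009-07-09","P-12345",2750
'''

def split_column_name(column):
    """Split a column name at the first underscore into (category, measurement)."""
    category, _, measurement = column.partition("_")
    return category, measurement

def image_path(row, image_name):
    """Join the optional path column and the file name column for one image."""
    folder = row.get("Image_PathName_" + image_name, "")
    return posixpath.join(folder, row["Image_FileName_" + image_name])

# The first line is consumed as the header; each later line is one image cycle.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
paths = [image_path(row, "FITC") for row in rows]
plates = [row["Metadata_Plate"] for row in rows]
titrations = [float(row["Titration_NaCl_uM"]) for row in rows]
category, measurement = split_column_name("Titration_NaCl_uM")
```

Here the paths are relative; in CellProfiler they would be resolved against the Default Input Folder (or whichever base image location is selected).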

Using metadata in LoadData

If you would like to use the metadata-specific settings, please see Help > General help > Using metadata in CellProfiler for more details on metadata usage and syntax. Briefly, LoadData can use metadata provided by the input CSV file for grouping similar images together for the analysis run and for metadata-specific options in other modules; see the settings help for Group images by metadata and, if that setting is selected, Select metadata tags for grouping for details.

Using MetaXpress-acquired images in CellProfiler

To produce a CSV file containing image location and metadata from a MetaXpress imaging run, do the following:

For a GUI-based approach to performing this task, we suggest using Pipeline Pilot.

For more details on configuring CellProfiler (and LoadData in particular) for a LIMS environment, please see our wiki on the subject.

Available measurements

See also the Input modules, LoadImages and CalculateStatistics.

Settings:

Input data file location

Select the folder containing the CSV file to be loaded. You can choose among the following options which are common to all file input/output modules:

Elsewhere and the two sub-folder options all require you to enter an additional path name. You can use an absolute path (such as "C:\imagedir\image.tif" on a PC) or a relative path to specify the file location relative to a directory:

An additional option is the following:

Name of the file

Provide the file name of the CSV file containing the data.

Load images based on this data?

Select Yes to have LoadData load images using the Image_FileName and Image_PathName fields (the latter is optional).

Base image location

The parent (base) folder where images are located. If images are contained in subfolders, then the file you load with this module should contain a column with path names relative to the base image folder (see the general help for this module for more details). You can choose among the following options:

Process just a range of rows?

Select Yes if you want to process a subset of the rows in the CSV file. Rows are numbered starting at 1, not counting the header line. LoadData will process up to and including the end row.

Rows to process

(Used only if a range of rows is to be specified)
Enter the row numbers of the first and last row to be processed.
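The row numbering convention can be illustrated with a short Python sketch. The function select_rows is hypothetical and assumes the header line has already been removed from the list of rows.

```python
def select_rows(rows, first, last):
    """Return data rows first..last, numbered from 1, with both ends included."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    # Convert the 1-based inclusive range to a 0-based Python slice.
    return rows[first - 1:last]

# Stand-ins for the data rows of a CSV file (header not included).
data_rows = ["row1", "row2", "row3", "row4", "row5"]
selected = select_rows(data_rows, 2, 4)
```

With a first row of 2 and a last row of 4, the second, third and fourth data rows are selected.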

Group images by metadata?

Select Yes to break the image sets in an experiment into groups that can be processed by different nodes on a computing cluster. Each set of files that share your selected metadata tags will be processed together. See CreateBatchFiles for details on submitting a CellProfiler pipeline to a computing cluster for processing.

Select metadata tags for grouping

(Used only if images are to be grouped by metadata)
Select the tags by which you want to group the image files here. You can select multiple tags. For example, if a set of images had metadata for "Run", "Plate", "Well", and "Site", selecting Run and Plate will create groups containing images that share the same [Run,Plate] pair of tags.
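As a sketch of the idea, the following Python groups image sets by the values of the selected tags. The function group_by_tags and the sample metadata values are made up for illustration; CellProfiler's internal grouping code may differ.

```python
from collections import defaultdict

def group_by_tags(image_sets, tags):
    """Collect image sets that share the same values for every selected tag."""
    groups = defaultdict(list)
    for image_set in image_sets:
        # The group key is the tuple of values for the selected tags.
        key = tuple(image_set[tag] for tag in tags)
        groups[key].append(image_set)
    return dict(groups)

image_sets = [
    {"Run": "1", "Plate": "A", "Well": "A01", "Site": "1"},
    {"Run": "1", "Plate": "A", "Well": "A02", "Site": "1"},
    {"Run": "1", "Plate": "B", "Well": "A01", "Site": "1"},
    {"Run": "2", "Plate": "A", "Well": "A01", "Site": "1"},
]
groups = group_by_tags(image_sets, ["Run", "Plate"])
```

Selecting Run and Plate yields three groups here, one per distinct [Run,Plate] pair; Well and Site are ignored when forming the groups.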

Rescale intensities?

This option determines whether image metadata should be used to rescale the image's intensities. Some image formats save the maximum possible intensity value along with the pixel data. For instance, a microscope might acquire images using a 12-bit A/D converter which outputs intensity values between zero and 4095, but stores the values in a field that can take values up to 65535.

Select Yes to rescale the image intensity so that saturated values are rescaled to 1.0 by dividing all pixels in the image by the maximum possible intensity value.

Select No to ignore the image metadata and rescale the image to 0 – 1.0 by dividing by 255 or 65535, depending on the number of bits used to store the image.
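The difference between the two choices can be sketched in Python as follows. The function rescale and its parameters are hypothetical, and plain lists stand in for image arrays; this is not CellProfiler's own loading code.

```python
def rescale(pixels, bits_stored, max_possible=None):
    """Scale raw integer pixel values to floating-point intensities.

    max_possible: maximum intensity recorded in the image metadata
    (for example 4095 for a 12-bit converter), or None to ignore the
    metadata and divide by the largest value the storage type can hold.
    """
    if max_possible is not None:
        divisor = max_possible           # the "Yes" choice
    else:
        divisor = 2 ** bits_stored - 1   # the "No" choice: 255 or 65535
    return [value / divisor for value in pixels]

# 12-bit data stored in a 16-bit field.
raw = [0, 2048, 4095]
with_metadata = rescale(raw, 16, max_possible=4095)
without_metadata = rescale(raw, 16)
```

With the metadata, the saturated value 4095 maps to 1.0; without it, the same pixel maps to 4095/65535, which is roughly 0.06.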