Managing Data Files

Revised: 10/21/07

To retrieve saved data for analysis, choose the file through File->Open….

The Open panel offers the option Load Raw Spike Signal from Data Files. Unless you are going to reanalyze spike trains you should leave this option unchecked (Expo will work faster). You can configure the default setting of this option through Expo->Preferences.

Expo provides several advanced capabilities for recovering programs from saved data, for exporting data, and for handling collections of data files and undertaking analyses on batches of them.

Recovering a Program

You can recover from any saved data the program used to collect them. To do this, choose Data->Recover Program. Expo will create a new Untitled program, which you can treat as you would any other new program.

Exporting Data in XML Documents

You can export the complete contents of a data set as an XML document. You might want to do this to undertake various kinds of analysis that can't be done with Expo's built-in functions, e.g.:

To correlate spike times with the display of particular image frames.

To do elaborate conditional filtering of data from passes through a running program.

To analyze raw spike waveforms to remove stimulus artifacts.

To analyze eye-movements represented in recorded analog signals.

XML documents are portable and can be read and parsed in analysis environments such as Python and Matlab. Julian Brown (julian@monkeybiz.stanford.edu) has written a suite of Matlab routines that can read and process the XML files. These routines extract data, including events, analog signals, spike times, and the raw spike waveform, in structured form. You can then analyze data with the full power of Matlab.

To export Expo data as XML, choose File->Export Data or press Command-Shift-X. A panel will appear for you to choose the name and location of the file to be created. By default the name will be that of the source data file, with the extension .xml.

If the data include the raw spike waveform you can choose whether to write that (or a sample of it) to the XML file.

Unless you want to work with the raw spike waveform it is advisable to omit it from the export, because it makes the XML file much larger and entails much longer processing time when the file is read by another program. (If you do export the raw spike waveform, and use Julian Brown's routines for analysis, it can be ignored during subsequent import by Matlab.)

The XML file contains complete information about the program, data, and environment settings such as timebase, monitor resolution, viewing distance, and analog sampling rate. The program resides in XML elements describing blocks, slots and routines. Data reside in a set of elements that describe passes through active states and events collected during them, and (if collected) spike times, spike waveforms, and continuously sampled analog signals. Spike waveforms and analog signals are encoded using Base64, which provides a text representation of binary data.
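As an illustration of the Base64 encoding described above, the following Python sketch decodes a Base64 payload found in an XML element. It is not based on Expo's actual schema: the element name AnalogSignal, the channel attribute, and the assumption that samples are 16-bit little-endian integers are all hypothetical and should be checked against a real exported file.

```python
import base64
import struct
import xml.etree.ElementTree as ET

# Build a small stand-in document. The element and attribute names are
# hypothetical; they are NOT taken from Expo's real XML schema.
_payload = base64.b64encode(struct.pack("<4h", 0, 1, -1, 2)).decode("ascii")
SAMPLE_XML = (
    "<Export>"
    '<AnalogSignal channel="0">' + _payload + "</AnalogSignal>"
    "</Export>"
)


def decode_samples(xml_text, tag="AnalogSignal"):
    """Return {channel: [samples]} for every element named `tag`.

    Assumes (hypothetically) that each element's text is Base64 for a
    sequence of 16-bit little-endian signed integers.
    """
    root = ET.fromstring(xml_text)
    decoded = {}
    for elem in root.iter(tag):
        raw = base64.b64decode(elem.text.strip())
        count = len(raw) // 2  # two bytes per assumed 16-bit sample
        samples = struct.unpack("<%dh" % count, raw[: count * 2])
        decoded[elem.get("channel")] = list(samples)
    return decoded


signals = decode_samples(SAMPLE_XML)
```

The Base64 step itself (base64.b64decode) is independent of these assumptions; only the element lookup and the sample format would need adjusting for a real file.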

You can do bulk conversion of data files to XML with the AppleScript droplet EXPO-ExportXML.

Appendix F explains how to use the Matlab routines that read, decompose, and analyze the XML files.

Finding Recently Saved Data

If you are running programs and saving data, and have enabled automatic naming of data files (see Naming Data Files), Expo will maintain a list of all files saved for a particular root name. These names are added to the File->Recent Data submenu as you create them. Expo clears the list of files and begins it anew each time you change the root filename.

Getting Information

If you have a set of data files from, e.g., a set of physiological experiments on a single animal, or a set of psychophysical runs, Expo can provide summary information about the collection (the name of each program used to collect the data, date and time, microdrive position, etc.).

To obtain summary information choose File->Data File Info…. Expo will display a panel from which you can select the set of data files you want it to examine (to select multiple files you can shift-click, or choose Select All). To restrict the search to files made by a particular program provide at "For Program" a regular expression to match the names of any programs used to collect the data. Expo uses the standard syntax for regular expressions, so you should remember to escape any period (\.) you want to use as a literal character. If you provide no expression Expo examines data in all files selected. By default, Expo will order the summary list alphabetically by name of data file. If you check Sort by Program, Expo will order the list by the name of the program used to collect the data.
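The effect of escaping a period can be seen in the short Python sketch below, which uses Python's re module as a stand-in for Expo's matcher. The program names are invented for illustration, and it is an assumption that Expo accepts a match anywhere in the name rather than requiring the whole name to match.

```python
import re

# Hypothetical program names, invented for this example.
program_names = ["tuning.v2", "tuningXv2", "contrast.v2"]


def matching_programs(pattern, names):
    """Return the names in which `pattern` matches somewhere."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.search(name)]


# An unescaped period matches any single character.
unescaped = matching_programs(r"tuning.v2", program_names)

# An escaped period matches only a literal period.
escaped = matching_programs(r"tuning\.v2", program_names)
```

Here the unescaped pattern selects both tuning.v2 and tuningXv2, while the escaped pattern selects only tuning.v2.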

When Expo has extracted the relevant information from selected files it will display summary information in a text window. The window's title shows the path to the folder containing the files. To open any file double-click the line that describes it.

Making a File List

The Data File Info window provides summary information about data files, but does not make them readily accessible for analysis. If you want to work with a subset of readily accessible data files you can build a menu through which you can open a file simply by choosing its name.

To build a file list choose File->Data File List…. Expo will display the file panel used for Data File Info…, and you can use the procedure described above to specify the files you want to work with. By default, Expo builds the menu alphabetically by name of data file. To have items ordered by the name of the program used to collect the data, check Sort by Program.

When Expo has found all the files that match your specification it puts their names in the File->Matched Files submenu. Choosing any file from this menu will open it.

Batch Analysis

Expo allows you to set up an analysis that it will apply in a single operation to a set of data files, saving the results in files, or printing them. You can:

Save tabular data in one or more files
Print tabular data
Print graphs (with or without fitted curves)
Save curve-fit parameters in one or more files
Print curve-fit parameters

To undertake a batch analysis you must first open a file that contains data of the sort you want to analyze, then open an analysis window and set up the formulas or load a template. If you want to fit curves to data and save or print curve-fit parameters, you should set up the curve-fit in the usual way (you needn't estimate parameters with Fit, but you should choose the function and set its initial parameter values via the curve sheet before clicking Done). If you want to analyze only a subset of the data collected in each file, set up the scope of the analysis with Scope…

Having set up the table you should select the columns for which you want the batch analysis run. Expo runs batch analyses only for formulas in selected columns.

Choose Data->Analyze Batch…. Expo will display a panel that allows you to choose the kind of analysis you want to undertake. If you have not set up curve-fitting parameters, the curve-fit options are dimmed. Click OK when you have chosen your analysis.

Saving Data in Excel Spreadsheets

If your chosen analysis requires an output file or files, Expo will display a panel through which you can specify the file(s) to receive output. By default Expo sends all output to a single file. If you want the results from each input file to be put in a separate output file, check Output in Multiple Files. Expo will then create a new output file for each input file; it names the output file by appending the text you provide at Save as: to the input name. Output files are created as Microsoft Excel (.xls) text documents with data in tab-separated columns.

When you have specified output files (if any) Expo will display a panel through which you can select the data files you want it to work through (to select multiple files, shift-click, or choose Select All). To restrict the analysis to files made by a particular program or programs provide at "For Program" a regular expression to match the names of any programs used to collect the data. Expo uses the standard syntax for regular expressions, so remember to escape a period (\.) if you want to use it as a literal character. If you provide no expression Expo examines data in all files in the batch selected.

While working its way through the batch, Expo will display a progress bar that shows how much of the analysis has been completed.

Data you save in .xls files can be opened as an Excel spreadsheet. They can also be readily imported into Matlab. Appendix F explains how to do this.
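Because the output files are text with tab-separated columns, they can also be read by any tool that handles delimited text. The Python sketch below shows one way to do so with the standard csv module. The column names and values are invented, and whether Expo writes a header row is an assumption to check against a real output file.

```python
import csv
import io


def read_tab_separated(stream):
    """Return the non-empty rows of a tab-separated text stream."""
    reader = csv.reader(stream, delimiter="\t")
    return [row for row in reader if row]


# Stand-in for the contents of an output file; the header and values
# are hypothetical.
sample = io.StringIO("Orientation\tRate\n0\t12.5\n90\t3.25\n")
rows = read_tab_separated(sample)

# Convert the data rows (everything after the assumed header) to floats.
values = [[float(cell) for cell in row] for row in rows[1:]]
```

For a file on disk, the same function can be applied to a handle opened with open(path, newline="").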

Format of Exported Data

By default, exported data are formatted as real numbers with whatever precision is needed to represent them. You can override this, and make Expo use whatever format is set for the relevant column in the Analysis Window, by choosing Expo->Preferences and checking Use Column Format for Exported Data on the Data Management tab.

Working with Multiple Subsets of Data

Before undertaking a batch analysis you can set the scope (via Scope… in the main analysis window) to restrict examination to a subset of the data.

When working with batches of files, Expo can analyze a series of segments of data from a single input file, putting the results of all the analyses in one output file, or printing them. Before making a multi-segment analysis, you must define the segment size by setting the scope in the analysis window that contains the master table. Click Scope… and specify the position and size of the first segment you want to be analyzed. For example, to start with the first pass, and to make the segment size a single pass, set First pass to analyze to 0 and Last pass to analyze to 0. To ensure that no pass will be skipped, set Analyze every 1 th pass.

When you start the batch analysis with Data->Analyze Batch…, check Iterate for All Data (the box will be dimmed if you have not restricted the scope to less than the full set of data). When Iterate for All Data is checked, Expo will analyze the first segment of data defined for each file, then will analyze succeeding segments of the size defined in the scope, until all data in the file are exhausted.
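The iteration over segments can be pictured with the following Python sketch. It is only a model of the behavior described above, not Expo's implementation: it assumes pass numbers are zero-based and inclusive, that successive segments are adjacent and do not overlap, and that a final partial segment is analyzed rather than discarded.

```python
def iterate_segments(total_passes, first_pass, last_pass):
    """List the (first, last) pass ranges visited for one file.

    `first_pass` and `last_pass` describe the first segment, as set in
    the scope. Later segments have the same size and follow on directly.
    Treating a short final segment as valid is an assumption.
    """
    size = last_pass - first_pass + 1
    if size < 1:
        raise ValueError("last_pass must not precede first_pass")
    segments = []
    start = first_pass
    while start < total_passes:
        end = min(start + size - 1, total_passes - 1)
        segments.append((start, end))
        start += size
    return segments


# Scope of a single pass (first = 0, last = 0) over a file of 3 passes.
single_pass_segments = iterate_segments(3, 0, 0)

# Scope of two passes (first = 0, last = 1) over a file of 5 passes.
two_pass_segments = iterate_segments(5, 0, 1)
```

With a single-pass scope each pass is analyzed on its own; with a two-pass scope over five passes the model yields the ranges 0 to 1, 2 to 3, and a final range containing only pass 4.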