.. _experiments:

Defining Experimental Details
=============================

There are two places where you specify the experimental details for an
analysis performed using lyman. General experimental parameters that are
constant across a group of subjects go in an *experiment file* in your
*lyman directory*. You also must define the paradigm information, which can
be different for each subject. Both modes of specification are documented
here.

The Experiment File
-------------------

Experiments are defined by creating a Python file in your lyman directory
called ``<experiment_name>.py``. This is just a regular Python file that gets
imported as a module at runtime. There is a default set of parameters that
will be overridden by the fields in this file; to see these defaults, run the
following code::

    In [1]: import lyman

    In [2]: print(lyman.default_experiment_parameters())

The namespace of the imported experiment file is used to update these values.
Here is a description of each of these fields.

Preprocessing Parameters
~~~~~~~~~~~~~~~~~~~~~~~~

.. glossary::

   source_template
      A string that, relative to your *data directory*, provides a path to
      the raw functional files for your experiment. It should include the
      string formatting key ``{subject_id}`` in place of the subject id in
      the path. When you have multiple runs of data, replace the run number
      with a glob wildcard. For instance, your template might look like
      ``"{subject_id}/bold/func_*.nii.gz"``.

   whole_brain_template
      If your acquisition covered only part of the brain (e.g. you are doing
      a high-res study), you can also supply a single EPI image with the same
      slice prescription but enough slices to cover the full brain. This
      image will be used to initialize the anatomical coregistration. This
      field provides a template path to this image, as with
      ``source_template`` (although you specify only one whole-brain file).
      If you don't have such an image, just leave this field out of the
      experiment file.

   fieldmap_template
      A template string, like the source template, that identifies images
      that can be used to compute a map of distortions in the field. This
      isn't actually meant to be a traditional "fieldmap" image, but rather
      multiple images with the same slice prescription as the functional
      runs, where the phase encoding direction is reversed in one or more of
      them.

   n_runs
      An integer with the number of functional runs in the experiment.

   TR
      A float with the repetition time of the acquisition in seconds.

   frames_to_toss
      An integer with the number of frames to trim from the beginning of the
      timeseries.

   fieldmap_pe
      The phase encoding directions of the fieldmap images, if they exist.
      For example, ``["y", "-y"]`` means that the fieldmap images have two
      frames, where the first has normal phase encoding on the y axis and the
      second has reversed phase encoding, also on the y axis.

   temporal_interp
      A boolean that specifies whether slice-time correction should be
      performed.

   interleaved
      A string or boolean that specifies whether the data were acquired with
      an interleaved protocol. Only relevant if ``temporal_interp`` is True.
      The only valid string is ``"siemens"``, which indicates a Siemens
      interleaved sequence (where the slice order depends on the number of
      slices). Set to ``True`` if you are using an interleaved acquisition on
      a GE scanner.
   coreg_init
      A string that is either ``"fsl"`` or ``"header"``. This controls how
      the boundary-based registration algorithm is initialized. Using
      ``"header"`` may give better performance for partial-brain
      acquisitions, but only if the anatomicals were acquired in the same
      session as the functionals.

   slice_order
      A string that is either ``"up"`` or ``"down"``. This corresponds to the
      slice acquisition order, and is only relevant if ``temporal_interp`` is
      True.

   intensity_threshold
      A float specifying the threshold for intensity artifacts. Frames where
      the median signal intensity across the brain exceeds the grand run
      median by this many median absolute deviations will be excluded from
      the model.

   spike_threshold
      A float specifying the threshold for white noise spike artifacts, or
      ``None`` to avoid detecting spikes (a plot is produced in either case).
      This is in the same units as the ``intensity_threshold``, but spikes
      are usually more extreme. Unlike intensity artifacts, white noise
      spikes are unidirectional; whether they appear as positive or negative
      deflections depends on your scanner. Ask your physicist.

   motion_threshold
      A float specifying the threshold for motion artifacts. Frames where the
      total displacement (in mm) relative to the previous frame exceeds this
      number will be excluded from the model.

   wm_components
      The number of principal components to use when extracting nuisance
      timeseries data from deep white matter voxels. This provides additional
      confound information that can be added during model fitting (using the
      ``confound_sources`` variable).

   smooth_fwhm
      A float with the smoothing kernel size for the volume-based SUSAN
      smoothing. Note that an unsmoothed version of the timeseries is always
      produced.

   hpf_cutoff
      A float with the cutoff time (in seconds) for the highpass filter. This
      value is used both in preprocessing and in the model. It can also be
      ``None`` to skip the highpass filter.
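To make these fields concrete, here is a sketch of how the preprocessing
portion of a hypothetical experiment file might look. The values are
placeholders chosen for illustration rather than recommendations, and the
template path assumes a made-up directory layout::

    # Illustrative preprocessing parameters for a hypothetical study
    source_template = "{subject_id}/bold/func_*.nii.gz"
    n_runs = 4
    TR = 2.0
    frames_to_toss = 6

    # Slice-time correction for an interleaved acquisition
    temporal_interp = True
    interleaved = True
    slice_order = "up"

    # Artifact detection thresholds
    intensity_threshold = 4.5
    motion_threshold = 1.0
    spike_threshold = None

    # Confound extraction, smoothing, and temporal filtering
    wm_components = 6
    smooth_fwhm = 6.0
    hpf_cutoff = 128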
Model Parameters
~~~~~~~~~~~~~~~~

.. glossary::

   design_name
      A string used to build the name of the file with paradigm information
      (see below).

   condition_names
      A list of strings with condition names. If this is absent or set to
      None, the sorted unique values in the ``condition`` field of the design
      file are used. Otherwise, the design matrix will include only the
      conditions named in this list (in the order provided here).

   regressor_file
      The name of a file containing information about other regressors to
      add to the timeseries model (see below).

   regressor_names
      A list of strings that can be used to select specific columns from the
      regressor file specified above. If None, all columns in the csv file
      are used.

   hrf_model
      A string corresponding to the name of the HRF model class. Currently
      only ``GammaDifferenceHRF`` is supported.

   temporal_deriv
      A boolean specifying whether a derivative regressor should be included
      in the model for each explanatory variable (these are considered
      regressors of no interest).

   confound_sources
      A list of strings that can include ``"motion"``, ``"wm"``, and
      ``"brain"``. This specifies what information should be added to the
      "confounds" component of the design matrix. ``"motion"`` adds the six
      motion parameters obtained during realignment, ``"wm"`` adds timeseries
      data from deep white matter with dimensionality reduced using PCA
      (controlled by the ``wm_components`` variable), and ``"brain"`` adds
      the mean timecourse over the whole brain mask.

   remove_artifacts
      A boolean specifying whether indicator vectors should be added to the
      design matrix to identify (and thus remove) frames that are determined
      during preprocessing to contain artifacts.

   confound_pca
      A boolean specifying whether the dimensionality of the confound matrix
      should be reduced using PCA to include dimensions explaining 99% of the
      variance.

   hrf_params
      A dictionary with keyword arguments for the HRF model class.

   contrasts
      A list of tuples, with one entry per contrast. Each contrast is defined
      by the 3-tuple ``(<name>, [<conditions>], [<weights>])``. For instance,
      if you want to test the contrast of hard events vs. easy events, you
      would use ``("hard-easy", ["hard", "easy"], [1, -1])``. The conditions
      must be present in the design, but you do not have to include the names
      of any conditions not involved in the contrast. If you provided a list
      of condition names, baseline contrasts are automatically generated for
      each of these conditions and prepended to this list. Importantly, the
      contrast names end up in file paths for the analysis results, so you
      should avoid spaces.

   memory_request
      An integer with the number of gigabytes of memory to request for model
      workflow nodes that involve large-memory computations. This only
      applies to submission through a ``qsub``-based distribution plugin.

Group Analysis Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~

.. glossary::

   flame_mode
      A string indicating the type of inference that should be performed in
      the group model. Options are ``ols``, ``flame1``, and ``flame12``, for
      ordinary least squares, MAP mixed effects, and full MCMC mixed effects,
      respectively.

   cluster_zthresh
      A float indicating the threshold used to initially define clusters in
      the Z-stat maps during multiple comparisons correction.

   grf_pthresh
      A float indicating the *p* value threshold for thresholding the
      corrected Z-stat images.

   peak_distance
      A float specifying the minimum distance (in mm) between local maxima
      when finding activation peaks.

   surf_name
      The name of a Freesurfer surface to plot group results on.

   surf_smooth
      The extent of spatial smoothing (in mm) to apply after sampling to the
      surface.

   sampling_range
      A 3-tuple of floats giving where to start, where to stop, and the size
      of the step (all in ``sampling_units``) when projecting data onto the
      white surface. This only applies to group analysis in fsaverage space.

   sampling_units
      A string that is either "frac" or "mm" that makes up part of the
      specification for projecting results onto the surface manifold (it
      determines the units of the ``sampling_range`` parameter). This only
      applies to group analysis in fsaverage space.

   sampling_method
      A string that is either "average", "max", or "point" that makes up part
      of the specification for projecting results onto the surface manifold
      (it determines how the samples obtained using ``sampling_range`` and
      ``sampling_units`` are summarized into a single value at each vertex).
      This only applies to group analysis in fsaverage space.

   surf_corr_sign
      A string that is either "pos", "neg", or "abs" giving the sign of the
      test to run. This only applies to group analysis in fsaverage space.
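Continuing the hypothetical experiment file sketched above, the model and
group-level fields might be written as follows. Every name and value here is
invented for illustration only::

    # Timeseries model (condition names and contrasts are placeholders)
    design_name = "standard"
    condition_names = ["easy", "hard"]
    hrf_model = "GammaDifferenceHRF"
    temporal_deriv = False
    confound_sources = ["motion", "wm"]
    remove_artifacts = True
    contrasts = [
        ("hard-easy", ["hard", "easy"], [1, -1]),
    ]

    # Group analysis
    flame_mode = "flame1"
    cluster_zthresh = 2.3
    grf_pthresh = 0.05
    peak_distance = 30
    surf_name = "inflated"
    surf_smooth = 5
    sampling_units = "frac"
    sampling_range = (0, 1, 0.1)
    sampling_method = "average"
    surf_corr_sign = "pos"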
The parameters that were present in this file at runtime will be saved with
the other processing outputs in the preproc and model analysis directories
(in a file called ``experiment_info.json``). Any comments in the docstring of
this module will be included in this archive. Note that if you preprocess
your data, change the experiment definition, and then run the model without
rerunning preproc, the preprocessing parameters in this archived file will be
inaccurate.

.. _design:

Detailed Design Information
---------------------------

The design file
~~~~~~~~~~~~~~~

You also have to generate a file in ``csv`` format for each subject
specifying what actually happened during the scan. This file should live at
``<data_dir>/<subject_id>/design/<design_name>.csv``, where ``design_name``
is specified in the experiment file. Each row in this file corresponds to an
event, where the term "event" is used broadly and can mean a "block" in a
block design experiment. At a minimum, the following fields need to be
present in this file:

.. glossary::

   run
      1-based index for the run number.

   condition
      A string with the condition name for the event.

   onset
      Onset time (in seconds) of the event. 0s is considered to be the onset
      of the first frame that is not trimmed (by ``frames_to_toss`` in the
      experiment file).

For example, an extremely basic design might look like this::

    run,condition,onset
    1,easy,0
    1,hard,12
    2,easy,0
    2,hard,12

Of course, you'll almost certainly want to write this file using `Pandas
<http://pandas.pydata.org/>`_ and not by hand (a short sketch appears at the
end of this section). This information can be augmented with the following
fields:

.. glossary::

   duration
      Duration (in seconds) of the event. If the duration is 0 (which is the
      default), the event is assumed to be an "impulse".

   value
      A parametric value corresponding to the height of the response. The
      default value is 1.

Additionally, other columns can be included with some parametric value for
each event (e.g. reaction time). This information is not used in the
timeseries model, but it can be used in decoding analyses to regress
confounds out of the data. A more complete file that will result in the same
design as the simple example above would read::

    run,condition,onset,duration,value,rt
    1,easy,0,0,1,0.894
    1,hard,12,0,1,1.217
    2,easy,0,0,1,0.993
    2,hard,12,0,1,1.328
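Since the design file is an ordinary ``csv``, building it with Pandas is
straightforward. Here is a minimal sketch that reproduces the more complete
example above; the output path is hypothetical, and in practice the file goes
under each subject's ``design`` directory with the name given by
``design_name``::

    import pandas as pd

    # One row per event, with the columns described above
    events = pd.DataFrame(
        {"run": [1, 1, 2, 2],
         "condition": ["easy", "hard", "easy", "hard"],
         "onset": [0, 12, 0, 12],
         "duration": [0, 0, 0, 0],
         "value": [1, 1, 1, 1],
         "rt": [0.894, 1.217, 0.993, 1.328]}
    )

    # Write without the index so only the design columns appear in the csv
    events.to_csv("design/standard.csv", index=False)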
The regressors file
~~~~~~~~~~~~~~~~~~~

A secondary and optional way to add design information uses a ``regressor``
file. This file, like the ``design`` file, should be a ``csv`` and should
live at ``<data_dir>/<subject_id>/design/<regressor_file>.csv``, where
``regressor_file`` is specified in the experiment file. The format is a csv
where column names are regressor names and rows are observations of the
regressors at each timepoint in the experiment. Additionally, the file must
have a ``run`` column specifying the 1-based run number for each observation.

This information is not transformed when building the design matrix beyond
de-meaning by run. This is intended to allow the use of, e.g., BOLD
timeseries information extracted from seed ROIs for functional connectivity
analyses. The regressors are considered elements "of interest" in the design
matrix, can be included in contrasts, and contribute to the "main model" R^2
calculation.

An example file for an experiment where each run has 3 TRs and the
experimenter is interested in functional connectivity with early visual areas
might look like::

    V1,V2,run
    1.46,1.55,1
    0.80,-0.37,1
    -1.91,-1.01,1
    -0.65,0.38,2
    1.00,1.01,2
    -0.88,-2.00,2

Each experiment can take information from at most one regressor file, but you
can create multiple regressor files for different experiments. It is also
possible to include all possible regressors in a single file and select the
specific columns for each experiment using the ``regressor_names`` field in
the experiment definition.

Specifying Alternate Models
---------------------------

You can fit several models to the same preprocessed data; in lyman, such a
model is called an *altmodel* or *alternate model*. To fit an alternate
model, create an experiment file called ``<experiment>-<altmodel>.py`` and
execute ``run_fmri.py`` with the arguments ``-experiment <experiment>
-altmodel <altmodel>``. This module "inherits" from the base experiment file,
so you only need to include information that differs from what was previously
defined. Note that the experiment parser isn't smart enough to detect when an
altmodel overrides parameters that affect preprocessing.
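As a sketch of how this inheritance works (all names here are invented for
illustration), a base experiment defined in ``myexp.py`` could be given an
alternate model in ``myexp-collapsed.py`` that lists only the fields that
change::

    # myexp-collapsed.py -- hypothetical alternate model; everything not
    # listed here is inherited from the base experiment file (myexp.py)
    design_name = "collapsed"
    condition_names = ["all_events"]

This assumes each subject also has a ``design/collapsed.csv`` file describing
the collapsed events, and the model would then be fit with ``run_fmri.py
-experiment myexp -altmodel collapsed``. Depending on the base experiment,
you may also want to override ``contrasts`` so that it only references
conditions present in the new design.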