Specifying analysis parameters

To control the execution of the lyman processing workflows, it is necessary to provide information about different aspects of the dataset, image acquisition, and experimental design. This information is generally communicated through text files stored in different locations. Those files are documented here.

Scan information

There must be a file called scans.yaml saved in the lyman directory. It should contain specifiers for every subject, session, and run in the project. The structure is easier to show than to describe. The file is a set of nested dictionaries: subject names map to session ids, session ids map to experiment names, and experiment names map to lists of run ids. For example:

subj01:
  sess01:
    exp_a: [run_1, run_2]
  sess02:
    exp_a: [run_1]
    exp_b: [run_1, run_2]
subj02:
  sess01:
    exp_b: [run_1, run_3]

Note that the session and run identifiers can be any string: instead of sess01 you could use a date and instead of run_1 you could use a time.
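To make the nesting concrete, the sketch below writes the same example as a Python dictionary and walks it to list every run of one experiment. The helper name iter_runs is hypothetical and not part of lyman; it only illustrates the subject, session, experiment, run hierarchy.

```python
# Equivalent of the scans.yaml example above, written as a nested dict:
# subject -> session -> experiment -> list of run ids.
scan_info = {
    "subj01": {
        "sess01": {"exp_a": ["run_1", "run_2"]},
        "sess02": {"exp_a": ["run_1"], "exp_b": ["run_1", "run_2"]},
    },
    "subj02": {
        "sess01": {"exp_b": ["run_1", "run_3"]},
    },
}


def iter_runs(scan_info, experiment):
    """Yield (subject, session, run) for every run of one experiment.

    Hypothetical helper for illustration; not a lyman function.
    """
    for subject, sessions in scan_info.items():
        for session, experiments in sessions.items():
            # Sessions that lack the experiment contribute no runs.
            for run in experiments.get(experiment, []):
                yield subject, session, run


exp_b_runs = list(iter_runs(scan_info, "exp_b"))
for subject, session, run in exp_b_runs:
    print(subject, session, run)
```

Note that subj01 has no exp_b runs in sess01, so that session is simply skipped.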

Project-level parameters

Information that is consistent for the entire project must be defined in a file named project.py that is present in the lyman directory. Note that this is a Python module that can define the following variables. Some parameters have default values that will be used if the variable is not present in the project file.

data_dir: The location where raw data is stored. Should be defined relative to the lyman_dir. (Defaults to ../data)
proc_dir: The location where lyman workflows will output persistent data. Should be defined relative to the lyman_dir. (Defaults to ../proc)
cache_dir: The location where lyman workflows will write intermediate files during execution. Should be defined relative to the lyman_dir. (Defaults to ../cache)
remove_cache: If True, delete the cache directory containing intermediate files after successful execution of the workflow. This behavior can be overridden at runtime by command-line arguments. (Defaults to True)
fm_template: A template string to identify session-specific fieldmap files. (Defaults to {session}_fieldmap_{encoding}.nii.gz)
ts_template: A template string to identify time series data files. (Defaults to {session}_{experiment}_{run}.nii.gz)
sb_template: A template string to identify reference volumes corresponding to each run of time series data. (Defaults to {session}_{experiment}_{run}_ref.nii.gz)
voxel_size: The voxel size to use for the functional template. (Defaults to (2, 2, 2))
phase_encoding: The phase encoding direction used in the functional acquisition. (Defaults to pa)
scan_info: Information about scanning sessions. (Automatically populated by reading the scans.yaml file).
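As a rough illustration, a project file that spells out the documented defaults might look like the following. The final lines are not part of a real project file: they assume the templates are filled in with Python's str.format, which this page does not state, and only show what file name a template would describe.

```python
# project.py: example values; paths are relative to the lyman directory.
data_dir = "../data"
proc_dir = "../proc"
cache_dir = "../cache"
remove_cache = True

fm_template = "{session}_fieldmap_{encoding}.nii.gz"
ts_template = "{session}_{experiment}_{run}.nii.gz"
sb_template = "{session}_{experiment}_{run}_ref.nii.gz"

voxel_size = (2, 2, 2)
phase_encoding = "pa"

# Illustration only (assumption: templates are filled with str.format).
# This is the time series file name ts_template describes for one run.
example_ts = ts_template.format(
    session="sess01", experiment="exp_a", run="run_1"
)
print(example_ts)
```

Any variable left out of the file falls back to the default listed above, so a real project file only needs the values that differ.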

Experiment-level parameters

Information that is consistent for an entire experiment must be defined in a file named <experiment>.py that is present in the lyman directory. Like the project file, this should be a Python module that defines the variables explained below. The experiment file can also define any of the model parameters documented further below. When it does, every model associated with the experiment uses that value unless the specific model file overrides it.

experiment_name: The name of the experiment. (Automatically populated from module name).
tr: The temporal resolution of the functional acquisition in seconds. (Defaults to 0.0)
crop_frames: The number of frames to remove from the beginning of each time series during preprocessing. (Defaults to 0)
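A minimal sketch of an experiment file, assuming an experiment named exp_a as in the scans.yaml example. All values are made up for illustration; smooth_fwhm is a model-level parameter placed here so that every exp_a model shares it.

```python
# exp_a.py: hypothetical experiment file for an experiment named "exp_a".
tr = 1.5          # seconds per volume (example value)
crop_frames = 4   # drop the first 4 frames of every run (example value)

# A model-level parameter set here is shared by every exp_a model
# unless a model file overrides it (example value).
smooth_fwhm = 2
```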

Model-level parameters

Information that is specific to a particular model must be defined in a file named <experiment>-<model>.py that is present in the lyman directory. This is also a Python module that defines the variables listed below. As explained above, model-level parameters defined in the experiment file apply to all models associated with that experiment unless a specific model file overrides them.

model_name: The name of the model. (Automatically populated from module name).
task_model: If True, model the task using a design file matching the model name. (Defaults to True)
smooth_fwhm: The size of the Gaussian smoothing kernel for spatial filtering. (Defaults to None)
surface_smoothing: If True, filter cortical voxels using Gaussian weights computed along the surface mesh. (Defaults to True)
interpolate_noise: If True, identify locally noisy voxels and replace their values using interpolation during spatial filtering. Warning: this option is still being refined. (Defaults to False)
hpf_cutoff: The cutoff value (in seconds) for the temporal high-pass filter. (Defaults to None)
percent_change: If True, convert data to percent signal change units before model fit. (Defaults to False)
nuisance_components: Anatomical sources and number of components per source to include. (Defaults to {})
save_residuals: If True, write out an image with the residual time series in each voxel after model fitting. (Defaults to False)
hrf_derivative: If True, include the temporal derivative of the HRF model. (Defaults to True)
contrasts: Definitions for model parameter contrasts. Each item in the list should be a tuple with the fields: (1) the name of the contrast, (2) the names of the parameters included in the contrast, and (3) the weights to apply to the parameters. (Defaults to [])
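The following sketch shows what a model file could look like, with the contrasts list written in the three-field tuple form described above. The model name main, the condition names faces and houses, and all numeric values are made up. The closing loop is an optional sanity check added for illustration, not something lyman requires.

```python
# exp_a-main.py: hypothetical model file for model "main" of "exp_a".
smooth_fwhm = 4          # overrides any experiment-level value
hpf_cutoff = 128         # high-pass cutoff in seconds (example value)
percent_change = True
save_residuals = False

# Each contrast: (name, parameter names, weights).
# "faces" and "houses" are made-up condition names.
contrasts = [
    ("faces", ["faces"], [1]),
    ("faces-houses", ["faces", "houses"], [1, -1]),
]

# Illustration only: every contrast needs one weight per parameter.
for name, params, weights in contrasts:
    assert len(params) == len(weights), name
```

The parameter names in each contrast should match condition values in the design file, since those become the columns of the design matrix.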

Design information

Information that determines the structure of the design matrix must be defined in a file named <model>.csv that is present for each subject in the directory <data_dir>/<subject>/design. This should be a CSV file that can be loaded into a pandas DataFrame. Each row in the file should correspond to an event that will be modeled.

The design file must have the following columns: session, run, onset, and condition. The session and run identifiers should correspond to the keys used in the scans.yaml file (see above). The onset information should define the time (in seconds, relative to the start of each run) that the event occurred. The condition column should be a string that defines the type of event; the unique condition values will become columns in the design matrix.

The design file may also have the following columns: duration and value. If duration information is present, it determines the duration of the modeled event (in seconds) before convolution with the hemodynamic response model. It defaults to 0, which specifies an impulse. If value information is present, it determines the amplitude of the modeled event before convolution with the hemodynamic response model. It defaults to 1.
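As an illustration of the column rules, the sketch below parses a small design file and applies the documented defaults for the optional columns. It uses only the standard library so that it runs anywhere; pandas.read_csv would load the same text into a DataFrame. The condition names are made up, and the parsing code is not how lyman itself reads the file.

```python
import csv
import io

# A small design file; "faces"/"houses" are made-up conditions, and the
# session and run values match keys from the scans.yaml example.
# The optional "value" column is deliberately left out.
design_text = """\
session,run,onset,condition,duration
sess01,run_1,0.0,faces,2.0
sess01,run_1,12.0,houses,2.0
sess01,run_2,6.0,faces,2.0
"""

events = []
for row in csv.DictReader(io.StringIO(design_text)):
    events.append({
        "session": row["session"],
        "run": row["run"],
        "onset": float(row["onset"]),        # seconds from start of run
        "condition": row["condition"],
        # Optional columns fall back to the documented defaults:
        # duration 0 (an impulse) and value 1.
        "duration": float(row.get("duration") or 0),
        "value": float(row.get("value") or 1),
    })

# The unique condition values become columns in the design matrix.
conditions = sorted({event["condition"] for event in events})
print(conditions)
```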