====== Scripting of the processing stream (example) ======
===== Motivation =====
While performing the steps manually is quite useful for educational purposes, on a daily basis it is much more convenient (and less error prone) to have one or several scripts do the work and to take care only of their configuration.
===== Requirements =====
For this example, the following programs must be installed and information available:
* bash, find, mkdir, mv, rsync, ssh, tar (come with most OSes that have a terminal or shell)
* Matlab v7.0 or higher, SPM5/SPM8, and NeuroElf v0.9a or higher
* location of subject data at the scanning facility (server)
* local folder and knowledge about desired structure
* onsets in a tabular structure where one column codes the subject and negative differences between onset times code the run separation (for an example, see below)
The scripts presented on this page can be downloaded (click on the filename in the tab above the script). For the bash scripts, please use the following command to make them executable:
chmod 755 SCANUnit_TASK_script.bash
To allow all users to execute the scripts, simply place them into a non-personal folder (e.g. /usr/local/bin).
For Matlab scripts, simply make sure they are located on your Matlab path.
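As a quick sanity check before running anything, a small helper script can verify that the required command-line tools are present. This is a convenience sketch, not part of the original processing stream; the tool list mirrors the requirements above:

```shell
#!/bin/bash
# check that the command-line tools used by the scripts on this page
# are available (convenience sketch; adapt the list as needed)
MISSING=0
for tool in bash find mkdir mv rsync ssh tar ; do
if ! command -v "$tool" >/dev/null 2>&1 ; then
echo "missing: $tool"
MISSING=1
fi
done
if [[ $MISSING -eq 0 ]] ; then
echo "all required tools found"
fi
```

If any tool is reported missing, install it via the OS package manager before continuing.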
===== Sub components =====
Since part of this scripted approach requires the shell, there are multiple scripts: the data handling (retrieval, archiving) is performed by shell commands, whereas the processing is implemented via SPM's job manager. The following steps are covered:
* data import (from a remote SSH-accessible server)
* correct renaming and creation of folders (into existing study structure)
* archiving (possibly configured as a regularly executed job via crontab)
* fMRI quality assessment (per run)
* preprocessing (SPM5 or SPM8)
* conversion into BrainVoyager QX's VTC format
* semi-automatic logfile parsing (PRT generation)
* computing an RFX-GLM of all available subjects (upon request)
===== Data import =====
#!/bin/bash
# settings ; PLEASE ADAPT
LOCAL_FOLDER=/Users/XXX/Desktop/CPU
REMOTE_FOLDER=/export/data/scanner/a_to_c/CPU
REMOTE_HOST=host.domain.columbia.edu
REMOTE_USER=xxx
# ensure there are arguments
if [[ $# -eq 0 ]] ; then
echo "No subjects requested."
echo "USAGE (import one subject): $0 subject1"
echo "USAGE (import several subjects): $0 subject1 subject2 subject3 ..."
echo "USAGE (import several subjects, pattern based): $0 subj\*patt"
echo "---"
echo "This script uses ssh. You will be asked to enter"
echo "the password of user $REMOTE_USER on host $REMOTE_HOST."
echo "---"
echo "Data will be stored in $LOCAL_FOLDER."
exit 1;
fi
# change into local folder
cd "$LOCAL_FOLDER" || exit 1
# retrieve requested files
ssh -l "$REMOTE_USER" "$REMOTE_HOST" "cd $REMOTE_FOLDER ; tar -cf - $@" | tar -xf -
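The last line streams a tar archive from the remote host and unpacks it on the local side. The same tar-to-tar pipeline can be tried locally (no ssh involved) to see what it does; the paths below are made up for the demonstration:

```shell
# demonstrate the "tar -cf - ... | tar -xf -" copy pipeline locally
# (illustrative only; /tmp/tar_demo stands in for the remote and local folders)
mkdir -p /tmp/tar_demo/src/subject1 /tmp/tar_demo/dst
echo "scan data" > /tmp/tar_demo/src/subject1/run1.dcm
# pack subject1 in src, unpack it in dst -- exactly what the ssh line does remotely
( cd /tmp/tar_demo/src ; tar -cf - subject1 ) | ( cd /tmp/tar_demo/dst ; tar -xf - )
ls /tmp/tar_demo/dst/subject1
```

The first tar writes the archive to standard output (''-f -''); the second reads it from standard input, so no intermediate file is ever created.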
===== Folder naming and creation =====
#!/bin/bash
# settings ; PLEASE ADAPT
LOCAL_FOLDER=/Users/XXX/Desktop/CPU
SUBJ_PATTERN=CPU*
SUBFOLDERS="functional onsets structural"
RAWFOLDER=raw
# Note: the SUBFOLDERS variable can be easily adapted to create
# several levels of folders:
# SUBFOLDERS="functional functional/raw functional/prepro ..."
# for every folder found (matching the pattern)
find "$LOCAL_FOLDER" -mindepth 1 -maxdepth 1 -type d -name "$SUBJ_PATTERN" | while read -r subject_folder ; do
# make sure all subfolders we need exist
for sub in $SUBFOLDERS ; do
mkdir "$subject_folder/$sub" 2>/dev/null
done
# rename original folder, if existent, to raw
find "$subject_folder" -mindepth 1 -maxdepth 1 -type d -name "$SUBJ_PATTERN" -exec mv {} "$subject_folder/$RAWFOLDER" \;
done
===== Archiving =====
For this, there is really no need for a script; simply let rsync do the work:
rsync -aut /Users/XXX/Desktop/CPU /Volumes/CPU/ImagingData
This can be easily added as a line to crontab (callable via ''crontab -e''):
30 2 * * * /usr/bin/rsync -aut /Users/XXX/Desktop/CPU /Volumes/CPU/ImagingData
To look up the help of crontab, enter ''man 5 crontab'' in the Terminal.
**Note: On Mac OS X, to ensure that the backup volume is always available, add it to the user's startup items list in System Preferences, and store the password to that network volume in your keychain!**
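The rsync flags used above are ''-a'' (archive mode, preserving permissions and structure), ''-u'' (skip files that are newer at the target), and ''-t'' (preserve modification times). Note that the source is given without a trailing slash, so the folder itself is replicated inside the target. This can be tried with throwaway paths (made up for the demonstration):

```shell
# demonstrate rsync -aut with a source given WITHOUT a trailing slash:
# the src folder itself is created inside the backup target
mkdir -p /tmp/rsync_demo/src/CPU001 /tmp/rsync_demo/backup
echo "v1" > /tmp/rsync_demo/src/CPU001/log.txt
rsync -aut /tmp/rsync_demo/src /tmp/rsync_demo/backup
ls /tmp/rsync_demo/backup/src/CPU001
```

Running the same command again transfers nothing, which is what makes it suitable as a nightly cron job.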
===== DICOM import =====
This part of the scripted process is already covered on its own page. See [[Processing stream - image file conversion|image file conversion]] for more information.
===== Preprocessing =====
Using the ''spm5_preprojobs'' function, we can easily set up a small script that preprocesses all (remaining) subjects:
% folder location
subjfolder = '/Users/XXX/Desktop/CPU';
subjpattern = 'CPU*';
funcfolder = 'functional';
anatfolder = 'structural';
% settings (for spm5_preprojobs)
funcpattern = 'r*340';
anatpattern = '';
usestruct = true;
runjobs = true;
disdaqs = 5;
smoothmm = 6;
sliceorder = 'aio';
tr = 2;
% complete settings
if ~isempty(funcpattern)
ffp = [funcfolder '/' funcpattern];
else
ffp = funcfolder;
end
if ~isempty(anatpattern)
anatpattern = [anatfolder '/' anatpattern];
else
anatpattern = anatfolder;
end
ppopts = struct( ...
'fun2str', usestruct, ...
'jobrun', runjobs, ...
'skip', disdaqs, ...
'smk', smoothmm, ...
'sto', sliceorder, ...
'sttr', tr);
% find subjects
subjects = findfiles(subjfolder, subjpattern, 'depth=1', 'dirs=1');
% for each subject
for sc = 1:numel(subjects)
% determine whether the subject needs processing
rps = findfiles([subjects{sc} '/' funcfolder], 'rp*.txt');
if ~isempty(rps)
continue;
end
% get subject ID
[basefld, subjid] = fileparts(subjects{sc});
subjid(subjid == '_') = [];
% run fMRI quality on all runs
runs = findfiles([subjects{sc} '/' funcfolder], funcpattern, 'dirs=1', 'depth=1');
for rc = 1:numel(runs)
q = fmriquality(findfiles(runs{rc}, {'*.img', '*.nii'}));
save(sprintf('%s/fmriquality.mat', runs{rc}), 'q');
clear q;
end
% run preprocessing
[j, jh, ppfiles] = spm5_preprojobs(subjects{sc}, ffp, anatpattern, ppopts);
% for each run, create one VTC
for rc = 1:numel(ppfiles)
[basefld, runid] = fileparts(fileparts(ppfiles{rc}{1}));
vtc = importvtcfromanalyze(ppfiles{rc});
vtc.TR = round(1000 * tr);
vtc.SaveAs(sprintf('%s/%s/%s_%s.vtc', subjects{sc}, funcfolder, subjid, runid));
vtc.ClearObject;
end
end
===== Semi-automatic logfile parsing and PRT creation =====
The onsets should already have been transformed into a tabular structure (e.g. in Microsoft Excel), so that they look like this:
SUBJ_ID INST_on CUE1_on CUE2_on INST_of CUE1_of CUE2_of
20216 3 nan nan 5003 nan nan
20216 nan 3003 nan nan 18003 nan
20216 18000 nan nan 23000 nan nan
20216 nan nan 21000 nan nan 26000
20216 74057 nan nan 79057 nan nan
20216 nan 77057 nan nan 92057 nan
20216 92068 nan nan 97068 nan nan
20216 nan nan 95067 nan nan 100067
20216 148112 nan nan 153112 nan nan
20216 nan 151112 nan nan 166112 nan
20216 166122 nan nan 171122 nan nan
20216 nan nan 169122 nan nan 174122
20216 1 nan nan 5001 nan nan
20216 nan 3001 nan nan 18001 nan
20216 17999 nan nan 22999 nan nan
20216 nan nan 21011 nan nan 26011
20216 74056 nan nan 79056 nan nan
20216 nan 77056 nan nan 92056 nan
20216 92066 nan nan 97066 nan nan
20216 nan nan 95065 nan nan 100065
20216 148110 nan nan 153110 nan nan
20216 nan 151110 nan nan 166110 nan
20216 166120 nan nan 171120 nan nan
20216 nan nan 169120 nan nan 174120
In this table, the first column codes the subject ID (used to detect when the next subject begins; in this sample there is only one subject!). The next three columns code the onsets and the final three columns the offsets of the three conditions used in this experiment (INST = instruction, CUE1 = cue type 1, CUE2 = cue type 2).
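The "negative difference" rule from the requirements (a drop in onset time marks the start of a new run) can also be illustrated outside Matlab. This awk sketch, with made-up onset values and awk assumed to be available as on any Unix-like OS, prints the 1-based row indices at which a new run starts:

```shell
# print the row numbers where a new run begins, i.e. where the onset
# is smaller than the previous one (made-up sequence: two runs of three trials)
printf '3\n3003\n18000\n1\n3001\n17999\n' | awk '
NR == 1 { print NR }             # the first row always starts a run
NR > 1 && $1 < prev { print NR } # negative difference => new run
{ prev = $1 }'
```

For the sequence above this prints ''1'' and ''4'', matching the ''runseps'' computed via ''diff(onsets) < 0'' in the Matlab script below.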
The following script (which can be copied into Matlab) will then create a set of PRT files out of this table:
% make sure the onset table is available in variable ot !!
% configuration
conds = {'INST', 'CUE1', 'CUE2'};
condcol = [255, 0, 0; 0, 255, 0; 0, 0, 255];
% find unique subject IDs
subjid = unique(ot(:, 1));
% for each subject ID
for sc = 1:numel(subjid)
% get this subject's sub-table
sot = ot(ot(:, 1) == subjid(sc), :);
% set NaNs to zero (for now)
sot(isnan(sot)) = 0;
% get onsets and offsets of any trial
onsets = sum(sot(:, 2:2+numel(conds)-1), 2);
% get table with NaNs (again) for condition detection
sot = ot(ot(:, 1) == subjid(sc), :);
% find run-separators
runseps = 1 + [0; find(diff(onsets) < 0)];
runseps(end+1) = numel(onsets) + 1;
% for each run
for rc = 1:numel(runseps)-1
% get part of table we need
rot = sot(runseps(rc):runseps(rc+1)-1, 2:end);
% create new PRT
prt = xff('new:prt');
% for each condition
for cc = 1:numel(conds)
% add to PRT
prt.AddCond(conds{cc}, rot(~isnan(rot(:, cc)), [cc, cc + numel(conds)]), condcol(cc, :));
end
% save PRT
prt.SaveAs(sprintf('%d_%d.prt', subjid(sc), rc));
% clear object
prt.ClearObject;
end
end
**Note: the PRT files will be written into the present working directory (unless the SaveAs line is appropriately altered), and they need to be moved or copied into their respective subject's folder!**
===== Computing an RFX-GLM with VTCs and PRTs =====
Once all files are in place, the Protocol and VTC files can be used to compute an RFX-GLM. Please follow the instructions given in the example of the [[mdm.ComputeGLM]] method reference page.