Neuroimaging datasets can easily become very large. For example, a study with 2 groups of 24 subjects each and a 3x2x2 within-subject design requires storing at least 48 * 13 = 624 maps. If each of these maps fully covers the brain at a 3mm resolution, this amounts to up to 200,000 voxels per map, requiring 800,000 bytes of disk space per map and about 500 MByte of storage capacity in total.
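The storage estimate above can be reproduced with a few lines of MATLAB. This is only a sketch: the variable names are made up for illustration, and the 4 bytes per voxel are an assumption (single precision) inferred from the 800,000 bytes quoted for a 200,000-voxel map.

```matlab
% Sketch of the storage estimate; all variable names are illustrative.
n_subjects       = 2 * 24;    % 2 groups of 24 subjects
maps_per_subject = 13;        % figure taken from the text
n_maps           = n_subjects * maps_per_subject;

voxels_per_map   = 200000;    % upper bound at 3mm resolution
bytes_per_voxel  = 4;         % ASSUMPTION: single-precision data
bytes_per_map    = voxels_per_map * bytes_per_voxel;

total_bytes = n_maps * bytes_per_map;
total_mb    = total_bytes / 1e6;
fprintf('%d maps, %d bytes per map, %.1f MByte in total\n', ...
    n_maps, bytes_per_map, total_mb);
```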
Working with such large datasets can become difficult if several of them must be “kept in memory” at the same time (e.g. the time course data used to produce a regression result, in addition to the regression result itself). This is why transparent I/O access to files is useful: it allows MATLAB to keep only those data in memory that are required for the task at hand.
To simplify the use of this concept, the transio class is tightly integrated into the xff class, which allows binary data files to be read with “transio access” enabled. The following syntax can be used to switch transio access on or off globally:
% enable transio access for all arrays larger than 500k
xff(0, 'transiosize', 5e5);

% disable transio access
xff(0, 'transiosize', Inf);

% enable transio access for VTCData elements only
xff(0, 'transiosize', 'vtc', 5e5);

% get current xff/transio configuration settings
xff_tio_config = xff(0, 'transiosize');

% restore configuration
xff(0, 'transiosize', xff_tio_config);
Currently, the transio access is limited to the following conditions:
transio (Object Class)

FORMAT:       tio_obj = transio(file, endian, class, offset, size);

Input fields:

      file        filename where data is stored in
      endian      endian type (e.g. 'le', or 'ieee-be')
      class       numerical class (e.g. 'uint16', 'single')
      offset      offset within file (in bytes)
      size        size of array in file

Output fields:

      tio_obj     transio object that supports the methods
                  - subsref  : tio_obj(I) or tio_obj(IX, IY, IZ)
                  - subsasgn : same as subsref
                  - size     : retrieving the array size
                  - end      : for building relative indices
                  - display  : showing information

Note 1: enlarging of existing files (if the array is the last element
        in the file) can be done by adding a (class-independent) sixth
        parameter to the call.

Note 2: both subsref and subsasgn will only work within the existing
        limits; growing of the array as with normal MATLAB variables
        is *NOT* supported--so tio_obj(:,:,ZI) = []; will *NOT* work!
Creating a transio object is done by a call to the constructor (@transio/transio) of the class:
% create a transio object for access of data in a NII file
niidata = transio('largedata.nii', 'le', 'single', 352, [81, 75, 75, 5780]);
If the file does not exist or is not “large enough” to accommodate the data (filesize < offset + typefactor * product-of-sizes), a sixth argument can be given to the constructor:
% create a transio object, allowing the underlying file to be grown
niidata = transio('largecopy.nii', 'le', 'single', 352, [81, 75, 75, 5780], true);
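The size condition (filesize < offset + typefactor * product-of-sizes) can also be checked by hand before calling the constructor. The following is a minimal sketch using only base MATLAB functions. It does not reproduce the check inside @transio/transio, and the variable names are illustrative; the "typefactor" is taken to be the number of bytes per element (4 for 'single').

```matlab
% Sketch: would the file have to be grown to hold the array?
fname  = 'largecopy.nii';           % file name from the example above
offset = 352;                       % offset within file (in bytes)
sz     = [81, 75, 75, 5780];        % size of array in file
bytes_per_element = 4;              % "typefactor" for class 'single'

% number of bytes the file must hold: header plus array data
required_bytes = offset + bytes_per_element * prod(sz);

% current file size (0 if the file does not exist)
info = dir(fname);
if isempty(info)
    file_bytes = 0;
else
    file_bytes = info(1).bytes;
end

% true if the sixth constructor argument would be needed
needs_growing = file_bytes < required_bytes;
fprintf('required: %.0f bytes, on disk: %.0f bytes\n', ...
    required_bytes, file_bytes);
```

For the array in this example, the required size comes to roughly 10.5 GByte, which illustrates why such data are better accessed from disk than loaded in full.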
In principle, data access works seamlessly, just as with a regular MATLAB variable…
% retrieving one volume of data
niivol = niidata(:, :, :, 1581);

% setting one volume of data
niidata(:, :, :, 2711) = newvol;
As this class is integrated into the xff class, it can be directly used, for instance to read only time courses of a VTC that fall within a mask:
% enable transio for VTC data
xff(0, 'transiosize', 'vtc', 5e5);

% load MSK and VTC
msk = xff('*.msk', 'Please select a mask file...');
vtc = xff('*.vtc', 'Please select a VTC file...');

% check objects
if isxff(msk, 'msk') && isxff(vtc, 'vtc')

    % get mask indices
    maski = find(msk.Mask(:));

    % read only the data we need
    maskedvtcdata = vtc.VTCData(:, maski);
end

% clear objects
clearxffobjects({msk, vtc});
The reference vtc.VTCData(:, maski); issues an overloaded call to @transio/subsref, which then resolves the indices into file positions, with as little overhead as possible.
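To illustrate what "resolving indices into file positions" means, the sketch below maps a subscript to a byte position and reads a single element from disk with base MATLAB file functions. It is not the code of @transio/subsref; the file layout (a 16-byte dummy header followed by a small single-precision array) and all variable names are hypothetical.

```matlab
% Illustrative sketch only; this is NOT the code of @transio/subsref.
% Write a small test file: dummy header followed by a 4x3x2 single array.
fname  = [tempname '.bin'];
offset = 16;                          % header size in bytes (hypothetical)
sz     = [4, 3, 2];                   % size of the array stored in the file
data   = single(reshape(1:prod(sz), sz));
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, zeros(1, offset, 'uint8'), 'uint8');
fwrite(fid, data(:), 'single');
fclose(fid);

% Resolve the subscript (2, 3, 2) into a byte position in the file.
lin     = sub2ind(sz, 2, 3, 2);       % column-major linear index
bytepos = offset + (lin - 1) * 4;     % 4 bytes per single element

% Read just that one element from disk, leaving the rest untouched.
fid = fopen(fname, 'r', 'ieee-le');
fseek(fid, bytepos, 'bof');
value = fread(fid, 1, 'single=>single');
fclose(fid);
delete(fname);
```

Because MATLAB stores arrays in column-major order, elements that are adjacent along the first dimension are also adjacent in the file, so a real implementation can read such runs in one call rather than element by element.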
The same syntax can also be used for write access “into” a transio object (handled by @transio/subsasgn).
Some of the more commonly used functions for numerical data (plus, minus, times, mtimes, etc.) have been overloaded so that transio objects can be used directly in expressions. However, since it is often more prudent to run complex computations on a double-precision copy of the data, the two conversion functions double and single are implemented as well.