====== transio (transparent file I/O access class) ======
===== Motivation =====
Neuroimaging datasets can easily reach very large sizes. For example, a study with 2 groups of 24 subjects each and a 3x2x2 within-subject design requires storing at least 48 * 13 = 624 maps. If each map fully covers the brain at 3mm resolution, this amounts to up to 200,000 voxels per map, or 800,000 bytes of disk space at 4 bytes per voxel, for a total of about 500MByte of required storage capacity.
Working with such large datasets can become difficult if several of them must be "kept in memory" at once (e.g. the time course data used to produce a regression result in addition to the regression result itself). Transparent I/O access to files addresses this by allowing MATLAB to keep in memory only the data required for the task at hand.
To simplify this concept, the class is highly integrated into the [[xff]] class, which allows binary data files to be read with "transio access" enabled. The following syntax can be used (globally) to switch transio access on or off:
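The storage estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation; the variable names are illustrative, and the 4 bytes per voxel follow from the 800,000 bytes quoted for 200,000 voxels.

```matlab
% numbers taken from the example study above
nsubjects = 2 * 24;          % 2 groups of 24 subjects
mapspersubject = 13;         % maps per subject, as in the text
voxelspermap = 200000;       % upper bound at 3mm resolution
bytespervoxel = 4;           % 800,000 bytes / 200,000 voxels

nmaps = nsubjects * mapspersubject;
bytespermap = voxelspermap * bytespervoxel;
totalbytes = nmaps * bytespermap;
fprintf('%d maps, %d bytes per map, %.1f MByte in total\n', ...
    nmaps, bytespermap, totalbytes / 1e6);
```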
% enable transio access for all arrays larger than 500k
xff(0, 'transiosize', 5e5);
% disable transio access
xff(0, 'transiosize', Inf);
% enable transio access for VTCData elements only
xff(0, 'transiosize', 'vtc', 5e5);
% get current xff/transio configuration settings
xff_tio_config = xff(0, 'transiosize');
% restore configuration
xff(0, 'transiosize', xff_tio_config);
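Because the setting is global, it can be useful to make sure it is restored even if an error occurs in between. The following is a minimal sketch that combines the get/restore calls shown above with MATLAB's ''onCleanup''; the function name ''with_vtc_transio'' and its arguments are hypothetical and not part of the toolbox.

```matlab
function varargout = with_vtc_transio(fhandle, minsize)
% WITH_VTC_TRANSIO  run fhandle() with transio enabled for VTC data
% (hypothetical helper, not part of xff or transio)

% remember the current xff/transio configuration
xff_tio_config = xff(0, 'transiosize');

% restore it when this function exits, also after an error
restore = onCleanup(@() xff(0, 'transiosize', xff_tio_config));

% enable transio access for VTCData elements larger than minsize
xff(0, 'transiosize', 'vtc', minsize);

% run the user code and pass on whatever outputs were requested
[varargout{1:nargout}] = fhandle();
end
```

The function handle passed in would contain the code that loads and processes the VTC files; once it returns (or fails), the previous configuration is back in place.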
===== Requirements =====
Currently, the transio access is limited to the following conditions:
* data is stored at a known and fixed "position" in the file (which can be determined at run-time, but must remain the same while the object is used)
* size of accessed array does not change (unlike regular arrays, transio access references do not allow changes in size at run-time)
* if complex indexing is performed, each index is only used once
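The first two conditions are what make it possible to compute where any element lives in the file. The sketch below illustrates that principle with plain MATLAB file I/O only; it is not the class's actual implementation, and the file layout (a 16-byte dummy header followed by a small single-precision array) is made up for the example.

```matlab
% write a 16-byte dummy header followed by a 2x3x4 single array
fname = [tempname '.bin'];
offset = 16;
data = single(reshape(1:24, [2, 3, 4]));
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, zeros(1, offset, 'uint8'), 'uint8');
fwrite(fid, data, 'single');
fclose(fid);

% read element (2, 3, 4) without loading the whole array:
% with a fixed offset and size, its byte position is fully determined
sz = size(data);
bytesper = 4;                          % bytes per single element
lin = sub2ind(sz, 2, 3, 4);            % linear (column-major) index
fid = fopen(fname, 'r', 'ieee-le');
fseek(fid, offset + (lin - 1) * bytesper, 'bof');
val = fread(fid, 1, 'single=>single');
fclose(fid);
delete(fname);
```

If the offset or the array size changed while the file was in use, the computed position would point at the wrong bytes, which is why both must stay fixed.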
===== Class reference ('help transio') =====
  transio (Object Class)
  FORMAT:       tio_obj = transio(file, endian, class, offset, size);
  Input fields:
        file        filename where data is stored in
        endian      endian type (e.g. 'le', or 'ieee-be')
        class       numerical class (e.g. 'uint16', 'single')
        offset      offset within file (in bytes)
        size        size of array in file
  Output fields:
        tio_obj     transio object that supports the methods
                    - subsref  : tio_obj(I) or tio_obj(IX, IY, IZ)
                    - subsasgn : same as subsref
                    - size     : retrieving the array size
                    - end      : for building relative indices
                    - display  : showing information
  Note 1: enlarging of existing files (if the array is the last element
          in the file) can be done by adding a (class-independent) sixth
          parameter to the call.
  Note 2: both subsref and subsasgn will only work within the existing
          limits; growing of the array as with normal MATLAB variables
          is *NOT* supported--so tio_obj(:,:,ZI) = []; will *NOT* work!
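The methods listed in the help text can be tried out on a small temporary file. The sketch below assumes the toolbox is on the MATLAB path; the file name and array size are made up, and the file is written without any header so that the offset is 0.

```matlab
% write a small single-precision array to a temporary file (no header)
fname = [tempname '.bin'];
data = single(reshape(1:120, [4, 5, 6]));
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, data, 'single');
fclose(fid);

% reference the file content through a transio object
tio_obj = transio(fname, 'le', 'single', 0, [4, 5, 6]);

% size: the array size given to the constructor
sz = size(tio_obj);

% subsref with end: the last slice along the third dimension
lastslice = tio_obj(:, :, end);

% subsref with a single (linear) index
firstvals = tio_obj(1:3);

% clean up
clear tio_obj;
delete(fname);
```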
===== Syntax overview =====
==== Creating a transio object ====
Creating a transio object is done by a call to the constructor (''@transio/transio'') of the class:
% create a transio object for access to the data in a NII file
niidata = transio('largedata.nii', 'le', 'single', 352, [81, 75, 75, 5780]);
If the file does not exist or is not "large enough" to accommodate the data (filesize < offset + typefactor * product-of-sizes, where typefactor is the number of bytes per element), a sixth argument can be given to the constructor:
% create a transio object, allowing the underlying file to be grown
niidata = transio('largecopy.nii', 'le', 'single', 352, [81, 75, 75, 5780], true);
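Whether the sixth argument is needed can be checked beforehand by comparing the file size on disk with the size implied by the constructor arguments. This is a sketch using only standard MATLAB functions; the variable names are illustrative, and 4 bytes per element is the type factor for ''single''.

```matlab
% constructor arguments from the example above
fname = 'largedata.nii';
offset = 352;
arrsize = [81, 75, 75, 5780];
bytesper = 4;                       % bytes per single element

% minimum file size needed to hold the array at this offset
required = offset + bytesper * prod(arrsize);

% compare with the actual file (dir returns empty if it does not exist)
finfo = dir(fname);
needs_growing = isempty(finfo) || finfo(1).bytes < required;
fprintf('required: %.0f bytes, sixth argument needed: %d\n', ...
    required, needs_growing);
```

For the sizes in this example, the required file size is 10,534,050,352 bytes, or roughly 10.5 GB.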
==== Accessing a transio object ====
In principle, data access works seamlessly, just as with a regular MATLAB variable:
% retrieving one volume of data
niivol = niidata(:, :, :, 1581);
% setting one volume of data
niidata(:, :, :, 2711) = newvol;
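A typical use of this kind of access is to process a 4D dataset one volume at a time, so that only a single volume is held in memory. The following sketch assumes the toolbox is on the path and uses a small, made-up temporary file in place of a real NII file.

```matlab
% create a small 4D single-precision test file (no header)
fname = [tempname '.bin'];
dims = [4, 5, 6, 10];
data = single(rand(dims));
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, data, 'single');
fclose(fid);

% reference the data through a transio object
tio = transio(fname, 'le', 'single', 0, dims);
sz = size(tio);

% accumulate the mean volume, reading one volume per iteration
meanvol = zeros(sz(1:3));
for vc = 1:sz(4)
    meanvol = meanvol + double(tio(:, :, :, vc));
end
meanvol = meanvol ./ sz(4);

% write access: overwrite the last volume with the mean (same size!)
tio(:, :, :, sz(4)) = single(meanvol);
written = tio(:, :, :, sz(4));

% clean up
clear tio;
delete(fname);
```

Note that the assignment replaces an existing volume; in line with Note 2 of the class reference, it cannot be used to append or remove volumes.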
As this class is integrated into the [[xff]] class, it can be directly used, for instance to read only time courses of a VTC that fall within a mask:
% enable transio for VTC data
xff(0, 'transiosize', 'vtc', 5e5);
% load MSK and VTC
msk = xff('*.msk', 'Please select a mask file...');
vtc = xff('*.vtc', 'Please select a VTC file...');
% check objects
if isxff(msk, 'msk') && isxff(vtc, 'vtc')
    % get mask indices
    maski = find(msk.Mask(:));
    % read only the data we need
    maskedvtcdata = vtc.VTCData(:, maski);
end
% clear objects
clearxffobjects({msk, vtc});
The reference ''vtc.VTCData(:, maski);'' issues an overloaded call to [[@transio/subsref]], which resolves the indices into file positions and reads the requested data with as little overhead as possible.
The same syntax can also be used for write access into a transio object (reference).
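To illustrate what resolving indices into file positions means for an access of the form ''(:, maski)'', the sketch below reads the time courses of selected voxels from a "time x voxels" file using plain MATLAB file I/O. It is not the code used by the class; the layout, the data type (int16) and all names are assumptions made for the example.

```matlab
% hypothetical "time x voxels" data stored as int16 after a 32-byte header
nt = 5;
nvox = 8;
offset = 32;
bytesper = 2;                       % bytes per int16 element
data = int16(reshape(1:(nt * nvox), [nt, nvox]));
fname = [tempname '.bin'];
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, zeros(1, offset, 'uint8'), 'uint8');
fwrite(fid, data, 'int16');
fclose(fid);

% each voxel's time course is one contiguous run of nt elements,
% so only those runs need to be read for the masked voxels
maski = [2; 5; 6];
masked = zeros(nt, numel(maski), 'int16');
fid = fopen(fname, 'r', 'ieee-le');
for mc = 1:numel(maski)
    fseek(fid, offset + (maski(mc) - 1) * nt * bytesper, 'bof');
    masked(:, mc) = fread(fid, nt, 'int16=>int16');
end
fclose(fid);
delete(fname);

% same result as indexing the in-memory array
same = isequal(masked, data(:, maski));
```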
==== Additional notes ====
Some of the more common functions applied to numerical data (plus, minus, times, mtimes, etc.) have been overloaded, so transio objects can be used directly in expressions. As it may be more prudent to use a double-precision copy of the data for complex computations, the two functions [[@transio/double|double]] and [[@transio/single|single]] are implemented as well.
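As a rough sketch of the conversion route, the example below creates a transio object on a small, made-up temporary file and converts it with ''double''. It assumes the toolbox is on the path and that the conversion returns the referenced data as a regular array, which presumably means the whole array is read into memory; for very large arrays, converting an indexed part may be the safer choice.

```matlab
% small uint16 test file (no header)
fname = [tempname '.bin'];
data = uint16(reshape(1:60, [3, 4, 5]));
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, data, 'uint16');
fclose(fid);

% reference the data through a transio object
tio = transio(fname, 'le', 'uint16', 0, [3, 4, 5]);

% double-precision copy of the complete array (assumed to load all data)
alldata = double(tio);

% double-precision copy of one slice only
slice = double(tio(:, :, 2));

% clean up
clear tio;
delete(fname);
```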