Whenever a program is used for data analysis, it is important for the community at large to understand which algorithms were used in the analysis. While NeuroElf is mostly written to make algorithms accessible (the user-friendliness aspect), it is equally relevant to ascertain that the methods implemented in any program have been accepted by the scientific community as “useful and reliable” (for achieving the intended goal) and are, as much as possible, free of errors, both in the algorithm itself and in its specific implementation in the given program.
As an example, initially when using the alphasim button (NeuroElf GUI access to the alphasim.m function) the GUI would demand user input for the estimated smoothness of the data, and a default value of 6mm was presented to the user. This choice (of default value) was motivated by the fact that, at the lab where I work, the smoothing operation during the preprocessing stage would be configured with a 6mm Gaussian kernel. However, the correct number to use ought to be an estimate of the spatial smoothness of the residual, because that determines how likely it is that, by chance, a cluster of a given size will be encountered in a statistical map (at any given uncorrected threshold), and this issue has since been addressed!
The following list gives an overview of the methods of analysis and parameter estimation that are implemented in NeuroElf (as far as they exceed basic operations, such as plain averaging across a dimension, or auxiliary functions that are used for string manipulation, file in-/output, or extended array operations, etc.):
Cluster size threshold estimation is a method that can be used to account for the fact that a regular whole-brain map is made up of multiple (partially) independent tests. One common way is to simply adapt the statistical threshold by dividing the desired false-positive rate (typically 5 per cent = 0.05) by (an estimate of) the number of independent tests. However, this can be too stringent in cases where larger swaths of cortex (neurocomputational network nodes) respond to an experimental manipulation below the then-required detection threshold. Instead of ensuring significance of results solely by applying a voxel-wise corrected statistical threshold, it is possible to estimate how large the clusters are that appear at random in a given search space, given the smoothness of the residual. That is, the alpha rate (false positives among performed tests) can be estimated by simulating statistical maps of the desired kind and then selecting the cluster size threshold that ensures that at most 5 per cent of maps (with the residual exhibiting the same smoothness) would show a false-positive cluster. The resulting pair of uncorrected statistical threshold and cluster size threshold together then correct a whole-brain map to a family-wise-error corrected threshold of the desired strength (again, usually 0.05). This algorithm is implemented in alphasim.m.
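To illustrate the principle, a stripped-down Monte Carlo version of such an estimation could look like the following MATLAB sketch (this is not the alphasim.m code; grid size, smoothness, threshold, and number of simulations are arbitrary assumptions, and a full implementation would additionally have to handle masking, connectivity options, and the estimation of smoothness from the data):

% stripped-down Monte Carlo sketch of cluster-size threshold estimation
dim   = [64 64 40];              % assumed search-space size in voxels
fwhm  = 2;                       % assumed residual smoothness in voxels
uthr  = 3.09;                    % z-equivalent of p < .001 (one-tailed)
nsim  = 1000;                    % number of simulated noise maps
sigma = fwhm / sqrt(8 * log(2)); % FWHM -> Gaussian sigma
ksz   = ceil(3 * sigma);
[x, y, z] = ndgrid(-ksz:ksz);
kern  = exp(-(x .^ 2 + y .^ 2 + z .^ 2) ./ (2 * sigma ^ 2));
kern  = kern ./ sum(kern(:));
maxsz = zeros(nsim, 1);
for s = 1:nsim
    m = convn(randn(dim + 2 * ksz), kern, 'valid'); % smoothed white noise
    m = m ./ std(m(:));                             % re-standardize to z
    cc = bwconncomp(m > uthr, 26);                  % supra-threshold clusters
    if cc.NumObjects > 0
        maxsz(s) = max(cellfun(@numel, cc.PixelIdxList));
    end
end
srt  = sort(maxsz);
kthr = srt(ceil(0.95 * nsim)) + 1; % cluster size exceeded in < 5% of null maps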
Cluster tables are often presented in publications describing analyses in which whole-brain mapping was performed, i.e. the attempt to localize the spatial nodes within cortex that subserve a specific function. This function is implemented in clustervol.m and a compiled MEX file, coded in clustercoordsc.c.
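As an illustration of what such a function has to do (not the actual clustervol.m/clustercoordsc.c code), a basic cluster table for an assumed 3D statistical map, map, could be produced along these lines:

% basic cluster table from a thresholded map (illustration only)
thr  = 3.09;                        % assumed uncorrected threshold
kthr = 20;                          % assumed cluster-size threshold (voxels)
cc   = bwconncomp(map >= thr, 26);  % 26-connected supra-threshold clusters
for c = 1:cc.NumObjects
    vox = cc.PixelIdxList{c};
    if numel(vox) < kthr
        continue;                   % skip clusters below the size threshold
    end
    [pk, pi] = max(map(vox));       % peak value within the cluster
    [px, py, pz] = ind2sub(size(map), vox(pi));
    fprintf('cluster %d: %d voxels, peak %.2f at voxel [%d, %d, %d]\n', ...
        c, numel(vox), pk, px, py, pz);
end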
Once a (thresholded) map has been segregated into separate volumes (such that voxels of different clusters do not “touch” voxels of another cluster), clusters of considerable size (e.g. more than 100 voxels) sometimes exhibit “local maxima”, i.e. the spatial gradient becomes positive again moving outwards from the overall maximum after initially being negative. To detect this, a 3D watershed algorithm has been implemented in the function splitclustercoords.m.
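A rough sketch of the idea (using Image Processing Toolbox routines rather than the splitclustercoords.m implementation; map and the logical array clustermask covering one large cluster are assumed) could be:

% split one large cluster at its local maxima via a 3D watershed (sketch)
wmap = -map;                          % watershed segments around minima,
wmap(~clustermask) = Inf;             % so invert the map and exclude voxels
lab  = watershed(wmap, 26);           % outside the cluster of interest
lab(~clustermask) = 0;                % each nonzero label is one sub-cluster
peaks = imregionalmax(map .* clustermask, 26); % voxels that are local maxima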
A conjunction analysis can be informative when, across the brain, the overlap of two statistical tests is of interest. The most stringent test that can be applied is to require that, in each considered voxel, both tests be significant at the desired level. This functionality is implemented in conjval, for statistics of the same kind and with the same d.f. parameter (i.e. a higher value means greater significance), and in conjvalp, for p-values (and possibly other statistics for which lower values mean greater significance; it also accepts negative values).
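In its simplest form this is a minimum-statistic conjunction, which can be sketched as follows (an illustration of the principle rather than the exact conjval/conjvalp code; t1, t2, p1, and p2 are assumed voxel-wise maps of two tests):

% minimum-statistic conjunction of two maps (illustration only)
conj_t = min(t1, t2);   % same-kind statistics (same d.f.): the smaller,
                        % i.e. less significant, value decides
conj_p = max(p1, p2);   % p-values: the larger (less significant) p decides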
Mediation analysis as a whole can be described as the estimation (and testing) of separate path coefficients, a and b, as well as their product, a*b, such that the “transmission” of an existing effect from an independent/explanatory variable, X, to an outcome variable, Y, is accomplished via one or several mediators, Mi. The analysis includes a test for significance of the a*b product term (as well as of the individual path coefficients) and also allows covariates to be specified. It is implemented in mediationpset.m, where the “pset” indicates that the function returns path coefficients (p), standard errors (se), and t-statistics (t).
An example would be, at the level of a between-subjects effect, that a randomly assigned condition (X, e.g. the strategy to apply to stimuli) has an effect on an outcome (Y, e.g. appetite for a specific type of stimulus, or the difference in appetite for two kinds of stimuli) via a specific brain region (or network of regions) that work(s) as mediator(s) (Mi, e.g. prefrontal control regions). For a within-subjects design, a test could be whether, on any given trial, the response in prefrontal cortex during an instructional cue (strategy stimulus) has an effect on the outcome (self-reported craving for the depicted food) via another brain region. In that case, either X (which brain region has an influence on the “craving center” of the brain) or M (which brain region is influenced by the “control region” of the brain) could be “searched for”…
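For a single mediator, the core of such an estimation can be sketched as follows (a simplified illustration with a first-order Sobel-type test, not the mediationpset.m code; X, M, and Y are assumed column vectors and covariates are omitted):

% single-mediator path estimation with a Sobel-type test (sketch)
n    = numel(X);
Xa   = [ones(n, 1), X];                 % path a: X -> M
ca   = Xa \ M;
a    = ca(2);
ra   = M - Xa * ca;
sa   = sqrt(sum(ra .^ 2) / (n - 2) / sum((X - mean(X)) .^ 2));
Xb   = [ones(n, 1), X, M];              % path b: effect of M on Y given X
cb   = Xb \ Y;
b    = cb(3);
rb   = Y - Xb * cb;
covb = inv(Xb' * Xb) .* (sum(rb .^ 2) / (n - 3));
sb   = sqrt(covb(3, 3));
ab   = a * b;                           % indirect (mediated) effect
seab = sqrt(a ^ 2 * sb ^ 2 + b ^ 2 * sa ^ 2); % first-order Sobel SE
zab  = ab / seab;                       % approximate z-test of a*b

In practice, the significance of the a*b product is often assessed with bootstrapping rather than the Sobel approximation, since the product of two (approximately) normally distributed estimates is not itself normally distributed.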
Multi-level kernel density analysis tries to determine whether reported “peak coordinates” in previously published papers (given a selection criterion, such as publications concerned with a specific psychological construct, e.g. fear or working memory) occur in specific spatial locations (spatial specificity) significantly more often than warranted by chance, as a means of pooling several publications to reduce the influence of any single publication on the “knowledge” of the spatial distribution of activation patterns. It is
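The kernel-density step at the heart of this approach can be sketched as follows (a simplified illustration under assumed settings, not the NeuroElf implementation; studies is an assumed cell array with one N-by-3 matrix of peak voxel coordinates per study, and study weighting as well as the Monte Carlo null distribution are omitted):

% simplified kernel-density step of an MKDA-style analysis (sketch)
dim = [91 109 91];                       % assumed 2mm MNI voxel grid
r   = 5;                                 % assumed sphere radius in voxels
[x, y, z] = ndgrid(-r:r);
sph  = double((x .^ 2 + y .^ 2 + z .^ 2) <= r ^ 2);
dens = zeros(dim);
for s = 1:numel(studies)
    ind = zeros(dim);                    % indicator map of this study's peaks
    cs  = studies{s};
    ind(sub2ind(dim, cs(:, 1), cs(:, 2), cs(:, 3))) = 1;
    ism = convn(ind, sph, 'same') > 0;   % binarized per-study activation map
    dens = dens + double(ism);
end
dens = dens ./ numel(studies);  % proportion of studies activating each voxel
% significance would then be assessed against maps with randomly placed peaks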
Ordinary least-squares (OLS) regression is the most generic way of applying the General Linear Model (GLM) to estimate “effect sizes”. Given the different applications, there are several functions implementing forms of this regression: the calcbetas.m function, the glmtstat.m function (which must be used to obtain the corresponding t-statistics), and the rbalign.m (rigid-body alignment) function, which uses the GLM framework to estimate motion parameters.
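The shared core of these functions can be sketched in a few lines (a generic GLM illustration, not the calcbetas.m/glmtstat.m code; X is an assumed n-by-p design matrix, Y an n-by-v matrix of voxel time courses, and the contrast vector is an arbitrary example):

% generic OLS estimation of betas and a contrast t-statistic (sketch)
b    = X \ Y;                          % least-squares beta estimates (p-by-v)
res  = Y - X * b;                      % residual time courses
df   = size(X, 1) - rank(X);           % error degrees of freedom
mse  = sum(res .^ 2, 1) ./ df;         % per-voxel error variance
con  = [1; zeros(size(X, 2) - 1, 1)];  % example contrast: first regressor
cvar = con' * pinv(X' * X) * con;      % contrast variance factor
t    = (con' * b) ./ sqrt(mse .* cvar);% voxel-wise t-statistics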
An additional small number of function files also perform some flavor of linear regression, but those are not applied to functional imaging data (e.g. the function regress_coords.m can be used to determine the transformation required to minimize the error between two sets of coordinates after a rigid-body transform).
Robust regression, in NeuroElf, is the estimation of regression parameters using an iteratively-reweighted-least-squares approach in which outliers are “detected” (and down-weighted) using the bi-square weighting function. It is implemented in fitrobustbisquare.m, fitrobustbisquare_img.m, and fitrobustbisquare_multi.m; to derive the corresponding t-statistics, robustt.m can be used (correcting for the loss in degrees of freedom); for robust correlations, robcorrcoef.m should be used, which uses both variables as the explanatory variable as a quick check; and robustnsamplet.m and robustnsamplet_img.m are available as well.
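The weighting scheme can be sketched as follows (a generic IRLS illustration rather than the fitrobustbisquare.m code; X is an assumed n-by-p design matrix, y an n-by-1 data vector, and the tuning constant is the conventional default):

% iteratively-reweighted least squares with bisquare weights (sketch)
tune = 4.685;                             % conventional bisquare tuning constant
w    = ones(size(y));                     % start from ordinary least squares
for it = 1:50
    bw   = (X .* w) \ (y .* w);           % weighted least-squares estimate
    res  = y - X * bw;
    s    = median(abs(res - median(res))) / 0.6745; % robust scale (MAD)
    u    = res ./ (tune * max(s, eps));
    wnew = sqrt((abs(u) < 1) .* (1 - u .^ 2) .^ 2); % bisquare -> sqrt-weights
    if max(abs(wnew - w)) < 1e-6
        break;                            % weights converged
    end
    w    = wnew;
end

Because down-weighted observations contribute less information, the effective degrees of freedom are smaller than in the OLS case, which is the loss that the subsequent t-statistics then have to account for.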