The ASSURANCE project is about developing new state-of-the-art algorithms for sparsity-aware distributed models, mainly for Machine Learning in fMRI data decomposition/analysis and in telecommunications.
The fMRISparse toolbox is a minimalistic collection of low-level data handling (matrix) functions for fMRI processing, block-based & event-based test pattern series, "realistic" simulated fMRI data series for algorithm benchmarking, template scripts for various fMRI decomposition methods (GLM, PCA, ICA, BP, CCA, K-SVD), analysis of components & activation maps, etc. Since the toolbox can be used as a benchmarking suite, several data generators are included for creating fully-identifiable fMRI-like data series.
Currently, the toolbox implements the low-level functionality for fMRI data structures and uses some standard implementations of commonly used algorithms for analysis/decomposition and blind source separation, including principal component analysis, multivariate linear regression, basis pursuit and independent component analysis (see below: "Decomposition / analysis algorithms"). It also includes some extended wrapper functions for fMRI-like data series generation, based on existing similar code packages (e.g. from the MLSP lab, see below: "fMRI data series generation"), that can be used in benchmarking experiments. The plan is to provide frequent version updates of the toolbox, extending it with new implementations, performance enhancements and novel algorithms for fMRI data processing, with a special focus on sparse data modeling.
The main implementation platform is Matlab, currently using toolboxes from versions 8.0+ (releases R2012b and above), e.g. the Basis Pursuit algorithms from the Wavelet Toolbox. However, with some small conversions in the source code, the main functions should be fully runnable on other Matlab-compatible platforms such as GNU Octave. Also, some external code and toolboxes are required to run some methods (e.g. FastICA); these can also be substituted by other similar packages if necessary (see below for details).
Some features are not yet fully implemented. All code is under constant development and in beta version, so it should not be considered a 'stable' release.
What does the fMRISparse toolbox include?
Here is a list of the main features currently included in the toolbox:
fMRI data series manipulation:
 vxVectorize / vxDevectorize : An fMRI data series typically contains a 3D grid of several 2D 'slices' of brain fMRI 'images', registered along a 4th dimension of time (temporal evolution). However, most decomposition and analysis algorithms require standard 2D matrices. These two functions convert a 3D (slice-based) or 4D (grid-based) fMRI data series into a standard 'flattened' matrix and vice versa. They are compatible with Matlab's standard 'reshape' function, but they keep the row-wise axis for time and the column-wise axis for voxels (a plain 'reshape' call orders the elements differently).
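The time-by-voxel layout described above can be sketched as follows. This is only a minimal illustration, written in plain Python rather than Matlab; it is not the toolbox code. The function names `vectorize` and `devectorize` and the nested-list layout (one 2D slice per time point) are assumptions made for this example.

```python
def vectorize(series):
    """Flatten a list of 2D slices (one per time point) into a
    time-by-voxel matrix: one row per time point, one column per voxel."""
    return [[value for row in frame for value in row] for frame in series]


def devectorize(matrix, height, width):
    """Rebuild the list of 2D slices from a time-by-voxel matrix."""
    if any(len(row) != height * width for row in matrix):
        raise ValueError("row length does not match height * width")
    return [
        [row[y * width:(y + 1) * width] for y in range(height)]
        for row in matrix
    ]


# Two time points of a 2x3 slice (hypothetical toy values).
series = [
    [[0, 1, 2], [3, 4, 5]],
    [[6, 7, 8], [9, 10, 11]],
]
flat = vectorize(series)            # 2 rows (time) x 6 columns (voxels)
restored = devectorize(flat, 2, 3)  # back to two 2x3 slices
```

Keeping time along the rows means that each column of the flattened matrix is the full time course of one voxel, which is the form most decomposition routines expect.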
fMRI voxel dependencies:
 vxBOLDtwogammaHRF : A standard, fully-parametric two-Gamma implementation of the Haemodynamic Response Function (HRF) for BOLD registration in fMRI. It is typically used as a convolution kernel when generating the components of a design matrix in various decomposition methods, e.g. GLM or OMP. The implementation and default parameter values are based on the model proposed in: Jagath C. Rajapakse, Juan Zhou, Learning effective brain connectivity with dynamic Bayesian networks, NeuroImage 37 (2007) 749-760.
 vxARMA : When generating artificial fMRI data series, various temporal and/or spatial dependencies between voxels need to be included. This function can be used as a post-processing stage that applies 1D, 2D or 3D spatial convolution kernels, as well as temporal convolution kernels along the 4th (time) dimension. It also includes options for adding noise to the final data.
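As a rough illustration of how a two-Gamma HRF serves as a convolution kernel, the sketch below samples such a kernel and convolves it with a block stimulus. It is written in plain Python and is not the toolbox implementation. The function names and the default shape values (6 and 16, with an undershoot ratio of 1/6) are assumptions for this example; they are commonly quoted values, not necessarily those of the toolbox or of the cited paper.

```python
import math


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density, defined as zero for t <= 0."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)
            / (math.gamma(shape) * scale ** shape))


def two_gamma_hrf(duration=30.0, dt=1.0, peak_shape=6.0,
                  undershoot_shape=16.0, undershoot_ratio=1.0 / 6.0):
    """Sample a two-Gamma HRF: a positive peak minus a scaled undershoot.

    The default parameter values are assumptions for this sketch."""
    n_samples = int(round(duration / dt)) + 1
    return [
        gamma_pdf(i * dt, peak_shape)
        - undershoot_ratio * gamma_pdf(i * dt, undershoot_shape)
        for i in range(n_samples)
    ]


def convolve_causal(stimulus, kernel):
    """Convolve a stimulus with a kernel, truncated to the stimulus length."""
    out = [0.0] * len(stimulus)
    for i, s in enumerate(stimulus):
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


hrf = two_gamma_hrf()
# Hypothetical block design: 10 scans rest, 10 scans task, 20 scans rest.
stimulus = [0.0] * 10 + [1.0] * 10 + [0.0] * 20
regressor = convolve_causal(stimulus, hrf)
```

The resulting `regressor` is the kind of smoothed, delayed time course that would typically form one column of a GLM design matrix.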
Test-pattern data series generators:
 vxBoxPatternCreate / vxBoxSeriesCreate : One of the most common cases of benchmarking data for fMRI decomposition algorithms is the use of 'box' patterns, with possible spatial and/or temporal overlapping. These two functions automate the generation procedures of such test-pattern data series.
 vxSeriesBlockCreate / vxSeriesEventCreate : Create block-based and event-based fMRI activation patterns to be used in combination with any test patterns for the generation of corresponding data series.
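The difference between block-based and event-based activation patterns, and their combination with a spatial 'box', can be sketched as follows. This is a plain-Python illustration under assumed conventions, not the toolbox code: the names `block_pattern`, `event_pattern` and `box_series`, the 0/1 coding and the `(top, left, bottom, right)` box format are all hypothetical.

```python
def block_pattern(n_scans, on_length, off_length, start_on=False):
    """Alternating rest/task time course with values 0 (rest) and 1 (task)."""
    period = on_length + off_length
    pattern = []
    for t in range(n_scans):
        phase = t % period
        if start_on:
            active = phase < on_length
        else:
            active = phase >= off_length
        pattern.append(1 if active else 0)
    return pattern


def event_pattern(n_scans, onsets):
    """Time course that is 1 at each event onset and 0 elsewhere."""
    onset_set = set(onsets)
    return [1 if t in onset_set else 0 for t in range(n_scans)]


def box_series(height, width, box, time_course):
    """Series of 2D frames in which a rectangular 'box' follows time_course.

    box = (top, left, bottom, right), with bottom and right exclusive."""
    top, left, bottom, right = box
    return [
        [[a if top <= y < bottom and left <= x < right else 0
          for x in range(width)] for y in range(height)]
        for a in time_course
    ]


blocks = block_pattern(n_scans=12, on_length=3, off_length=3)
events = event_pattern(n_scans=12, onsets=[2, 7, 9])
frames = box_series(3, 3, (0, 0, 2, 2), blocks)  # 12 frames of 3x3 voxels
```

Overlapping boxes can be produced by summing two such series frame by frame, each with its own box and time course.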
fMRI post-processing / various:
 vxCalcActivMap : After analyzing an fMRI data series into components, the 'sources' are correlated to known (task-related) activation patterns in order to identify 'activated' voxels in brain areas. This function is currently based on correlation in the temporal domain, since this is much faster, more robust and more generic than similar model-based approaches, e.g. performing statistical significance tests upon the 'beta' coefficients of a GLM model. However, the corresponding activation maps based on p-values are currently much less accurate, since this feature is not yet fully implemented here. Instead, the correlation itself, either in the temporal domain ('fixed' versus 'discovered' patterns) or in the spatial domain (voxel runs with/without specific 'sources' included), should be used for the illustration of 'active' brain areas.
 signal_rescale : General-purpose function for rescaling a data series, e.g. used by 'vxBOLDtwogammaHRF'.
 signal_white : General-purpose function for 'whitening' a data series, i.e., normalizing it to mean=0 and std=1.
 calcSparsity : Given a coefficient matrix (after decomposition), this function calculates the number of non-zero coefficients per 'signal'. In sparse coding stages (e.g. using BP), these numbers correspond to the given sparsity constraint (T0).
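A correlation-based activation map of the kind described for 'vxCalcActivMap' can be sketched in a few lines, reusing the mean=0, std=1 normalization described for 'signal_white'. The plain-Python code below is only an illustration under stated assumptions: the function names `whiten` and `correlation_map`, the time-by-voxel list layout and the threshold of 0.5 are hypothetical and not taken from the toolbox.

```python
import math


def whiten(signal):
    """Rescale a 1D signal to zero mean and unit (population) std."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in signal) / n)
    if std == 0:
        raise ValueError("constant signal cannot be rescaled to std=1")
    return [(v - mean) / std for v in signal]


def correlation_map(data, pattern, threshold=0.5):
    """Correlate each voxel time course with a known activation pattern.

    data is a time-by-voxel matrix (list of rows). Returns the per-voxel
    Pearson correlations and a 0/1 map of voxels above the threshold
    (an assumed, arbitrary cut-off). Constant voxels get correlation 0."""
    n_scans = len(pattern)
    ref = whiten(pattern)
    correlations = []
    for v in range(len(data[0])):
        column = [data[t][v] for t in range(n_scans)]
        try:
            z = whiten(column)
        except ValueError:
            correlations.append(0.0)
            continue
        correlations.append(sum(a * b for a, b in zip(z, ref)) / n_scans)
    active = [1 if r > threshold else 0 for r in correlations]
    return correlations, active


# Toy data: voxel 0 follows the task, voxel 1 is constant,
# voxel 2 is anti-correlated with the task.
pattern = [0, 0, 1, 1, 0, 0, 1, 1]
data = [[2 * p + 1, 5, 1 - p] for p in pattern]
corr, active = correlation_map(data, pattern)
```

Because both signals are normalized first, the correlation reduces to the mean of their element-wise product, which is why this approach is cheap compared with fitting and testing a full regression model per voxel.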
fMRI data series generation:
 vxCreateSimData : Create a typical block-based test-pattern data series with (optionally) spatial and temporal overlapping between two 'box' activations. This can be modified to create any other type and shape of activation.
 vxCreateRealData : Create a "realistic" fMRI-like data series (real-valued) for use with any type of decomposition algorithm, for a single 'slice' and a total of 8 independent 'sources', one of which is the main task-related component to be identified. The generator optionally displays the complete run, illustrating the full-mixture and (only) the task-related activation map. The final output is the full-mixture run, i.e., a 2D image evolving through time, as well as the components (sources) and mixture matrices. The implementation is a rescaling-enabled modular version of a similar script by the MLSP lab; for further details see: N. Correa, Y.-O. Li, T. Adali, and V. Calhoun, Comparison of blind source separation algorithms for fMRI using a new Matlab toolbox: GIFT, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), Philadelphia, PA, vol. 5, pp. 401-404, March 2005.
 vxCreateComplexData : Very similar to 'vxCreateRealData' but with complex-valued elements that include randomized phase shifts. These data series are useful when the decomposition process is linked to blind-source separation and/or low-level compressive sensing, e.g. during the acquisition stage of fMRI (k-space compression). The implementation is a rescaling-enabled modular version of a similar script by the MLSP lab; for further details see: N. Correa, Y.-O. Li, T. Adali, and V. Calhoun, Comparison of blind source separation algorithms for fMRI using a new Matlab toolbox: GIFT, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), Philadelphia, PA, vol. 5, pp. 401-404, March 2005.
Decomposition / analysis algorithms
The fMRISparse toolbox currently includes implementations or usage templates for six main decomposition algorithms for the analysis of fMRI data series:
Fixed-components methods:
 General Linear Model (GLM) : A standard regression-based method for analyzing fMRI data series, employing multivariate linear regression against a predefined design matrix that usually contains HRF convolutions of task-related and transient task-related activation patterns. In the fMRISparse toolbox, this method is implemented by using Matlab's mvregress function from the Statistics toolbox.
 Canonical Correlation Analysis (CCA) : A standard correlation-based method for analyzing fMRI data series by calculating a pair of matrices A and B that maximize the correlation between the fMRI data series and a predefined design matrix that usually contains HRF convolutions of task-related and transient task-related activation patterns. In the fMRISparse toolbox, this method is implemented by using Matlab's canoncorr function from the Statistics toolbox.
 Matching Pursuit (MP) : One of the most popular algorithms for sparse representation and compressive sensing via redundant dictionaries. In the fMRISparse toolbox, this method uses the Orthogonal Matching Pursuit (OMP) variant as implemented by Matlab's wmpalg function from the Wavelet toolbox. Since the current version includes only template scripts, a typical GLM design matrix is used here instead of an overcomplete dictionary; this means that there is no real "selection" of appropriate components from the dictionary, only an optimal ordering and weighting of them. Furthermore, due to sparsity constraints, the final reconstruction quality in this case is degraded (more components would normally be used). For real benchmarking tests, the wmpdictionary function can be used to create a proper redundant dictionary containing an adequate number of components, e.g. wavelets.
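The regression step behind the GLM entry above can be illustrated for a single voxel with ordinary least squares. The sketch below is plain Python and does not reproduce Matlab's mvregress or its options; the function names `solve_linear` and `glm_fit`, the two-column design matrix and the toy voxel are assumptions made for this example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def glm_fit(design, signal):
    """Ordinary least squares: solve (X^T X) beta = X^T y for one voxel."""
    n_reg = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(n_reg)]
           for i in range(n_reg)]
    xty = [sum(row[i] * y for row, y in zip(design, signal))
           for i in range(n_reg)]
    return solve_linear(xtx, xty)


# Hypothetical design: a constant baseline column plus one task regressor.
task = [0, 0, 1, 1, 0, 0, 1, 1]
design = [[1.0, float(p)] for p in task]
voxel = [3.0 + 2.0 * p for p in task]   # noise-free toy voxel
beta = glm_fit(design, voxel)           # baseline and task weights
```

In practice the task column would be an HRF-convolved regressor rather than a raw 0/1 pattern, and the fit would be repeated (or vectorized) over all voxel columns of the flattened data matrix.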
Non-fixed-components (blind) methods:
 Principal Components Analysis (PCA) : The standard SVD-based method for the decomposition of a data series into a (reduced) eigenspace. Instead of providing a predefined design matrix, here the activation patterns are "discovered" as well; however, due to its inherent sensitivity to noise, the corresponding spatial components are usually contaminated by "mirror" artifacts (complementary dependencies). In the fMRISparse toolbox, this method is implemented by using Matlab's pca function from the Statistics toolbox.
 Independent Components Analysis (ICA) : Instead of just uncorrelated components, ICA analyzes the data series into independent components using higher-order statistical constraints. The two most popular approaches to ICA are FastICA and Infomax. If noise levels are low, the "discovered" components closely match the true "sources" that produce the signal, e.g. the corresponding internal "sources" of the mixture that is produced by vxCreateRealData (see above). In the fMRISparse toolbox, this method is employed by using the FastICA toolbox for Matlab (see below for details) and can be substituted by any other similar implementation.
 Sparse Dictionary Learning-Decomposition (K-SVD) : K-SVD is one of the most popular and well-studied methods for data (matrix) decomposition under sparsity constraints, using a 'trained' dictionary of coding atoms. The method is a generalization of K-means clustering, essentially employing singular value decomposition (SVD) techniques to iteratively pursue an optimal formulation of the atoms in the dictionary, while at the same time adjusting the corresponding regression coefficients. Thus, it is a 'blind' decomposition method, similar to PCA or ICA, and it also resembles classic signal transformations (e.g., DFT, DWT, DCT, ...), except that its basis functions are "learned" from the data. The K-SVD analysis can work by applying constraints either on the final approximation error or on the required sparsity level of the coefficients. If dictionary size and sparsity level are chosen appropriately, the "learned" dictionary atoms closely match the true "sources" that produce the signal, e.g. the corresponding internal "sources" of the mixture that is produced by vxCreateRealData (see above). In the fMRISparse toolbox, this method is employed by using the K-SVD toolbox for Matlab (see below for details) and can be substituted by any other similar implementation.
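To give a feel for how a 'blind' method such as PCA "discovers" a component without a design matrix, the sketch below extracts the leading principal direction of a small time-by-voxel matrix. Note the swapped-in technique: it uses power iteration on the covariance matrix rather than the SVD, and it is plain Python, not Matlab's pca function. The function name, the starting vector and the toy data are assumptions for this example.

```python
import math


def first_principal_component(data, n_iter=200):
    """Leading principal direction of a time-by-voxel matrix, found by
    power iteration on the voxel covariance matrix."""
    n_rows, n_cols = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n_rows - 1)
            for j in range(n_cols)] for i in range(n_cols)]
    vec = [1.0 / math.sqrt(n_cols)] * n_cols   # assumed starting vector
    for _ in range(n_iter):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_cols))
               for i in range(n_cols)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            raise ValueError("no variance along the starting vector")
        vec = [v / norm for v in nxt]
    variance = sum(vec[i] * cov[i][j] * vec[j]
                   for i in range(n_cols) for j in range(n_cols))
    return vec, variance


# Toy data: voxels 0 and 1 follow the same task pattern with different
# amplitudes, voxel 2 is silent.
pattern = [0, 0, 1, 1, 0, 0, 1, 1]
data = [[1.0 * p, 2.0 * p, 0.0] for p in pattern]
direction, variance = first_principal_component(data)
```

Here the recovered direction weights the two active voxels in proportion to their amplitudes and ignores the silent one, which is the spatial 'component' a PCA-style decomposition would report for this toy series.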
Main scripts / Test-runs (demos)
Currently, there are two main scripts that can be used as code templates for similar procedures, i.e., they show how fMRI data can be generated, analyzed and illustrated in various ways:
Test-pattern fMRI data series decomposition scripts:
 run1_createblockseries : This is a code template for a typical block-based test-pattern series generation, analysis and presentation. The script does not actually include any real or simulated fMRI data series; it is used only for illustration purposes (data handling & processing) and to verify the proper use of decomposition algorithms, e.g. PCA. It uses the 'vxCreateSimData' function to generate a full fMRI-like block-pattern series, performs various decomposition procedures (PCA, ICA, GLM, BP), as well as HRF convolution on activation patterns and ARMA-plus-noise post-processing on the input data, and displays full runs of generated and decomposed/reconstructed data series.
 run2_createRealSeries : This is a code template for a typical block-based simulated fMRI data series generation, analysis and presentation. This script uses "realistic" real-valued fMRI data including a single 'slice', i.e., a 2D image evolving through time; it can be used for illustration purposes (data handling & processing), as well as to verify the proper use of decomposition algorithms, e.g. PCA or K-SVD. It can also be used as a template for comparative testing of various decomposition methods, with regard to the accuracy of identified components, reconstruction error, etc. It uses the 'vxCreateRealData' function to generate a full "realistic" one-slice fMRI data series, performs various decomposition procedures (PCA, ICA, GLM, BP, CCA, K-SVD) and displays full runs of generated and decomposed/reconstructed data series.
Downloads / Resources
The fMRISparse toolbox is freely available under the Creative Commons Attribution-ShareAlike international license 3.0 (CC-BY-SA), with no guarantee and no support (at least for now). See source code for individual options and version history details.
Please feel free to send feedback, suggestions and bug reports. However, keep in mind that this is still a work-in-progress, so incomplete features, errors and instabilities are to be expected.
 fMRISparse toolbox for Matlab / version 1.11 / release 25-Sept-2013 :
http://dx.doi.org/10.5281/zenodo.59612 >> Download (22 KB) (MD5: 7c84efb50d667b75a64976127e08a44f)
 FastICA toolbox for Matlab : Developed and published by the ICS team at the Helsinki University of Technology, Finland. It is used as an external package by the fMRISparse toolbox, but any similar ICA implementation of FastICA or Infomax can be used instead after minor modifications in the main (template) scripts. An Octave implementation is also provided; see their main page for details & downloads.
 K-SVD toolbox for Matlab : A joint work developed and published by Michael Elad, Michal Aharon and Freddy Bruckstein. It is used as an external package by the fMRISparse toolbox, but any similar K-SVD implementation can be used instead after minor modifications in the main (template) scripts. This toolbox was adopted specifically for its simple implementation and maximum compatibility with Octave (pure Matlab implementation with no compiled .mex files required, although it may be slow with large datasets). See their main page for details & more downloads.
 Kittipat fMRI preprocessing toolbox for Matlab : Not required by the current version of the fMRISparse toolbox, but highly recommended, as it provides a very lightweight and compact alternative (e.g. compared to SPM8) for handling fMRI data files in NIFTI format. Instead of simulated data series, a real fMRI data file can be imported and used as input in the main (template) scripts, either for a single 'slice' (2D/3D) or a full brain scan (3D/4D).
 Openfmri.org : One of the most comprehensive and well-documented collections of fMRI datasets, freely available for download, in NIFTI format.
All the documents and related material by Harris Georgiou are licensed, in parts and as a whole, under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. All the code sources and related material by Harris Georgiou are licensed, in parts and as a whole, under an EU Public License.
