Automatic Spectroscopic data Categorization by cLustering ANalysis (ASCLAN)

Automatic Spectroscopic data Categorization by cLustering ANalysis (ASCLAN) is a data-driven approach for distinguishing discriminatory variables for phenotypic subclasses.

The encrypted ASCLAN Matlab code can be freely downloaded for use in data analysis. However, use and redistribution of the code, in whole or in part, for commercial purposes requires the explicit permission of the authors and explicit acknowledgment of the original publication. We ask users of the ASCLAN approach to cite the ASCLAN paper as well as the following papers in any resulting publications.

Publications and patent numbers:

  • Cloarec O, Dumas ME, Craig A, Barton RH, Trygg J, Hudson J, Blancher C, Gauguier D, Lindon JC, Holmes E, Nicholson J. Anal Chem. 2005 Mar 1;77(5):1282-9. Statistical total correlation spectroscopy: an exploratory approach for latent biomarker identification from metabolic 1H NMR data sets.
  • Crockford DJ, Holmes E, Lindon JC, Plumb RS, Zirah S, Bruce SJ, Rainville P, Stumpf CL, Nicholson JK. Anal Chem. 2006 Jan 15;78(2):363-71. Statistical heterospectroscopy, an approach to the integrated analysis of NMR and UPLC-MS data sets: application in metabonomic toxicology studies.
  • STOCSY patent numbers: US 20070043518, US7373256 and US7835872.

Description and Use

The current version supports datasets that contain two biological groups.

Use of function: [sig_idx, nondis_idx, model] = ASCLAN(data1, data2, var, plotOpt)

Input parameters:

"data1" and "data2" are the groups of spectra from the two biological classes (one spectrum per row)
"var" is the vector of spectral variables, e.g., ppm
"plotOpt" is the plot option; "1" plots the color-coded OPLS-DA loading (for NMR only)

In the ASCLAN parameter file ("OPLSDA para.txt") you may change the parameters used to construct the OPLS-DA model. The parameters are:

nrcv - number of folds in cross validation
nc - number of correlated components in x
ncox - number of y-orthogonal components in x
ncoy - number of x-orthogonal components in y
np - number of iterations in permutation test
preprocessing - "mc" for mean centering, "uv" for unit variance scaling, "pa" for pareto scaling
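
As a rough illustration of what the three preprocessing options conventionally mean, the sketch below applies the textbook definitions to a small made-up matrix. The ASCLAN code itself is encrypted, so this is not its implementation; in particular, it is an assumption here that "uv" and "pa" also mean-center the data first. The matrix X and all variable names are hypothetical.

```matlab
% Toy matrix: 4 spectra (rows) x 3 variables (columns); the values are made up.
X = [1 10 100; 2 20 300; 3 30 200; 4 40 400];
n = size(X, 1);

mu = mean(X, 1);       % mean of each variable (column)
sd = std(X, 0, 1);     % standard deviation of each variable (n-1 normalisation)

% "mc": subtract the column mean from every variable
X_mc = X - repmat(mu, n, 1);

% "uv": mean-center, then divide each variable by its standard deviation
X_uv = X_mc ./ repmat(sd, n, 1);

% "pa": mean-center, then divide each variable by the square root
%       of its standard deviation (Pareto scaling)
X_pa = X_mc ./ repmat(sqrt(sd), n, 1);
```

Pareto scaling sits between the other two options: large peaks are damped less aggressively than with unit variance scaling, so baseline noise is inflated less.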

Interpretation of the output

Output:

"sig_idx" contains the indices of the discriminatory variables.
"nondis_idx" contains the indices of the nondiscriminatory variables.
"model" contains the OPLS-DA results; e.g., the Q2 and the corresponding p-value from the permutation test can be extracted from model.Q2Yhatcum and model.P_Q2.

The color-coded loading plot shows the discriminatory variables, the nondiscriminatory signal variables and the noise variables in red, blue and grey, respectively.

The red and green dots above the loading are interpreted as follows:
red dot - upward loading peak - up-regulated in “data2” compared to “data1”
green dot - downward loading peak - down-regulated in “data2” compared to “data1”
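
The Q2 and permutation p-value stored in the model output can be inspected along the following lines. Since ASCLAN cannot be run here, the struct below is a hand-made placeholder with invented numbers; only the field names Q2Yhatcum and P_Q2 come from this page. The 0.05 threshold is an assumption, as is taking the last element of Q2Yhatcum in case it holds one cumulative value per component.

```matlab
% Placeholder standing in for the model returned by ASCLAN (invented values).
model = struct('Q2Yhatcum', 0.62, 'P_Q2', 0.01);

alpha = 0.05;                 % assumed significance threshold, not set by ASCLAN

q2 = model.Q2Yhatcum(end);    % cumulative Q2 (last entry if it is a vector)
p  = model.P_Q2;              % p-value of Q2 from the permutation test

fprintf('Q2 = %.2f, permutation p-value = %.3g\n', q2, p);

% Flag the model as reliable only if the permutation test is significant.
isReliable = p < alpha;
if isReliable
    disp('Q2 is unlikely to have arisen by chance.');
else
    disp('The model may be overfitted; treat the discriminatory variables with caution.');
end
```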

Download Instructions

The encrypted ASCLAN Matlab script and the parameter file may be downloaded below. To ensure they are downloaded correctly, right-click on the links and select 'Save link as..' or 'Save target as..', depending on your browser, and then save them to a location on your computer. The script and parameter file can then be loaded into Matlab.

shocsy
shocsy parameter

You can also download a demonstration data set, which shows how to set up a dataset for running the ASCLAN script. To download it, follow the instructions above for the Matlab script and parameter file. Note that the data file is 38 MB, so the download may take a while over a slow internet connection.

dummy data

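Within the Matlab environment, type the following commands to load the data:
data_cluster{1} = xlsread('demo_data', 1);   % sheet 1: spectra of the first group
data_cluster{2} = xlsread('demo_data', 2);   % sheet 2: spectra of the second group
ppm = xlsread('demo_data', 3);               % sheet 3: spectral variables (ppm)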

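To run the ASCLAN code, type:
plotOpt = 1;   % 1 = draw the color-coded OPLS-DA loading plot
% The transposes put one spectrum per row, as ASCLAN expects.
[discri_idx, nonDiscri_idx, OPLSDA_model] = ASCLAN(data_cluster{1}',data_cluster{2}', ppm, plotOpt);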
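
Before making the call, it can help to confirm that the inputs have the expected shape. The sketch below uses random toy matrices in place of the demo data (the sizes are invented) and checks that, after transposing, each spectrum is a row whose length matches the variable axis. The NaN check is there because xlsread returns NaN for empty or non-numeric cells. The names data1 and data2 are only local helpers for this illustration.

```matlab
% Toy stand-ins for the demo data: variables in rows, spectra in columns.
nVar = 200;
data_cluster = {rand(nVar, 5), rand(nVar, 6)};
ppm = linspace(10, 0, nVar)';

data1 = data_cluster{1}';   % 5 spectra x 200 variables
data2 = data_cluster{2}';   % 6 spectra x 200 variables

% Every spectrum (row) must span the full variable axis.
assert(size(data1, 2) == numel(ppm), 'data1: columns do not match the variable axis');
assert(size(data2, 2) == numel(ppm), 'data2: columns do not match the variable axis');

% Empty spreadsheet cells are read as NaN and should be dealt with first.
assert(~any(isnan(data1(:))) && ~any(isnan(data2(:))), 'NaN values found in the spectra');
```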

You may want to output the discriminatory and non-discriminatory variables themselves rather than their indices by typing:
discri_ppm = ppm(discri_idx);
nonDiscri_ppm = ppm(nonDiscri_idx);
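
For reporting, neighbouring discriminatory variables can be grouped into contiguous regions. The following is a minimal sketch with made-up values; it assumes that discri_idx holds numeric positions into ppm (not a logical mask) and that adjacent indices belong to the same peak. The names breaks, starts, stops and regions are introduced here for illustration only.

```matlab
% Made-up values standing in for the ASCLAN outputs.
ppm        = linspace(10, 0, 11);   % 10, 9, ..., 0
discri_idx = [2 3 4 8 9];

discri_idx = sort(discri_idx(:))';            % sorted row vector
breaks = find(diff(discri_idx) > 1);          % positions where a run of indices ends
starts = discri_idx([1, breaks + 1]);         % first index of each run
stops  = discri_idx([breaks, numel(discri_idx)]);   % last index of each run

% One row per region: [ppm at first index, ppm at last index]
regions = [ppm(starts)', ppm(stops)'];
disp(regions);
```

With these toy values the two runs of indices, 2 to 4 and 8 to 9, give the regions 9 to 7 ppm and 3 to 2 ppm.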

Copyright © Medway School of Pharmacy
Last Updated 09/05/2016