CHIMERA: Clustering of Heterogeneous Disease Effects via Distribution Matching of Imaging Patterns
Many brain disorders and diseases exhibit heterogeneous symptoms and imaging characteristics, as shown in figure (A). This heterogeneity is typically not captured by commonly adopted neuroimaging analyses, which seek only a single main imaging pattern when two groups need to be differentiated (e.g., patients and controls, or clinical progressors and non-progressors). On the other hand, standard data-driven clustering methods may group patients according to the largest sources of data variability, which are not necessarily induced by the disease. The proposed probabilistic clustering approach, CHIMERA, as illustrated in figure (B), models the pathological process as a combination of multiple regularized transformations from the normal control population to the patient population, while controlling for similarity in covariates (e.g., age, gender, height). It therefore seeks to identify multiple imaging patterns that relate to disease effects and to better characterize disease heterogeneity.
CHIMERA software is freely available under a BSD-style open source license that is compatible with the Open Source Definition by The Open Source Initiative and contains no restrictions on use of the software. The full license text is included with the distribution package and available online.
To Download: Please visit our NITRC Page for CHIMERA.
This software performs clustering of heterogeneous disease patterns within a patient group. The clustering is based on imaging features, covariate features and dataset information.
2. TESTING & INSTALLATION
We provide a test sample in the test folder. Simply run the following command:
This runs a test script, which may take a few minutes. The test case contains a synthetic 20-dimensional dataset. The data file is named test_data.csv. The imaging features have a nonlinear correlation with covariate 1 and no correlation with covariate 2. The test case checks whether the script "chimera" runs correctly and whether "chimera_test" works as expected. The expected adjusted Rand index should be larger than 0.9 for this sample test.
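For readers unfamiliar with the adjusted Rand index mentioned above, the following is a minimal sketch of the standard (Hubert and Arabie) formula in plain Python. It is not taken from the CHIMERA test script; the function name adjusted_rand_index and the example label lists are assumptions made for illustration only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard adjusted Rand index between two labelings of the same samples."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("both labelings must cover the same samples")
    if n < 2:
        return 1.0  # no pairs to compare; treated as perfect agreement
    # Pair counts inside the contingency cells, the rows and the columns.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical labels: the same partition with the cluster names swapped.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

The index does not depend on how clusters are numbered, so a clustering that recovers the true subgroups under different label names still scores 1.0.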
I. Running "chimera":
Here is a brief introduction to running CHIMERA. For a complete list of parameters, see the --help option.
To run this software you will need an input csv file, with the following mandatory fields:
(a) Subject group label (binary), header "Group". 1: patient; 0: normal control.
(b) At least one imaging feature, header "IMG".
An example csv file, data.csv, looks like this:
ID, COVAR, COVAR, IMG, IMG, ..., Group, Set
ADNI_0001, 15.1, 0.454, 0.212, 0.13, ..., 0, 1
ADNI_0002, 20.9, 0.121, 0.343, 1.32, ..., 0, 2
ADNI_0003, 21.2, 0.141, 0.143, 0.21, ..., 1, 2
...
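Before running the software, it can help to check that an input file meets the mandatory-field requirements listed above. The following is a minimal sketch using only the Python standard library; the function check_input is hypothetical and not part of CHIMERA, and the sample keeps only the columns shown in the example, with the elided ones left out.

```python
import csv
import io


def check_input(csv_text):
    """Return a list of problems found in a CHIMERA-style input table."""
    rows = [[cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(csv_text)) if row]
    header, records = rows[0], rows[1:]
    problems = []
    # (b) At least one imaging feature column, header "IMG".
    if "IMG" not in header:
        problems.append('no "IMG" column')
    # (a) A binary group label column, header "Group".
    if "Group" not in header:
        problems.append('no "Group" column')
        return problems
    group_col = header.index("Group")
    bad = sorted({r[group_col] for r in records} - {"0", "1"})
    if bad:
        problems.append("non-binary Group values: " + ", ".join(bad))
    return problems


sample = """ID,COVAR,COVAR,IMG,IMG,Group,Set
ADNI_0001,15.1,0.454,0.212,0.13,0,1
ADNI_0002,20.9,0.121,0.343,1.32,0,2
ADNI_0003,21.2,0.141,0.143,0.21,1,2
"""
print(check_input(sample))  # an empty list means no problems were found
```

The csv module is used rather than a dataframe library because the header repeats the names COVAR and IMG, and some readers rename duplicate columns on load.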
If you install the package successfully, there will be two ways of running CHIMERA:
1. Running the standalone script (recommended):
chimera -i data.csv -r output.txt -k 3 -o model.cpkl -m 20 -v
2. Running as a package, a simple example:
import CHIMERA
CHIMERA.run(dataFile, outFile, numClusters, verbose=True)
The software returns:
1. clustering labels in output.txt
2. transformation model in model.cpkl (cPickle binary mode)
** For the best performance, the sample size should be large enough (100+) and the parameters should be cross-validated.
II. Running "chimera_test":
Here is a brief introduction to running CHIMERA_TEST, which generates clustering labels for test samples. The inputs are a csv file of test data and the model file produced by running CHIMERA. Please make sure the data fields in the csv file match those used in the training phase.
1. Running the standalone script:
chimera_test -i data.csv -r output.txt -m model.cpkl
2. Running as a python package:
import CHIMERA
CHIMERA.test("data.csv", "output.txt", "model.cpkl")
The software returns clustering labels in output.txt.
For the software license please visit LICENSE
Version 1.2.2 (August 2017)
- Release on NITRC
Version 1.1.0 (September, 2016)
- Repackage using standard python setup tools.
Version 1.0.0 (August, 2016)
- First public release of the CHIMERA software.
- Aoyan Dong
- Developed the algorithm, implemented the software.
- Nicolas Honnorat
- Developed the algorithm.
Please cite [TMI2016] when you use CHIMERA in your research:
[TMI2016]: A. Dong, N. Honnorat, B. Gaonkar and C. Davatzikos, "CHIMERA: Clustering of Heterogeneous Disease Effects via Distribution Matching of Imaging Patterns," in IEEE Transactions on Medical Imaging, vol. 35, no. 2, pp. 612-621, Feb. 2016.