Multiple Imputation for High-dimensional Incomplete Data
MIHD: R package for multiple imputation for high-dimensional incomplete data.
- Y. Zhao and Q. Long, “Multiple imputation in the presence of high-dimensional data,” Statistical methods in medical research, p. 962280213511027, 2013.
- Y. Deng, C. Chang, M. S. Ido, and Q. Long, “Multiple imputation for general missing data patterns in the presence of high-dimensional data,” Scientific reports, vol. 6, art. 21689, 2016.
Bootstrap Imputation with Variable Selection
BISS: R package for implementing bootstrap imputation with variable selection.
Reference: Q. Long and B. A. Johnson, “Variable selection in the presence of missing data: resampling and imputation,” Biostatistics, vol. 16, iss. 3, pp. 596-610, 2015.
Package: BISSpkg 1.0
Knowledge-guided Sparse PCA
fgsPCA: Matlab code to perform structured sparse PCA
Reference: Z. Li, S. Safo, and Q. Long, "Incorporating Biological Information in Sparse Principal Component Analysis with Application to Genomic Data", BMC bioinformatics 18.1 (2017): 332.
Matlab Code: fgsPCA
Scalable Bayesian Variable Selection for Structured High-dimensional Data
EMSHS: R code implementing an EM algorithm for a Bayesian shrinkage approach that incorporates structural information
Reference: Chang, C., Kundu, S., & Long, Q. (2018). Scalable Bayesian variable selection for structured high‐dimensional data. Biometrics. (https://onlinelibrary.wiley.com/doi/full/10.1111/biom.12882)
Package: EMSHS R package on CRAN
Sparse Linear Discriminant Analysis in Structured Covariates Space
sSLDA: Matlab code to perform structured sparse LDA
Reference: Safo, S.E., and Long, Q. (2019) Sparse linear discriminant analysis in structured covariates space. Statistical Analysis and Data Mining: The ASA Data Science Journal, 12(2), pp.56-69.
Matlab Code: sSLDA
Structured Sparse CCA
sSCCA: Matlab code to perform structured sparse CCA
Reference: S. Safo, S. Li and Q. Long, "Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information", Biometrics 74.1 (2018): 300-312.
Matlab Code: sSCCA_v2
Penalized Co-Inertia Analysis
pCIA: R package for implementing penalized co-inertia analysis for two datasets.
Reference: E. Min, S. Safo, and Q. Long, “Penalized Co-Inertia Analysis with Applications to –Omics Data”, Bioinformatics, 2019, 35(6):1018-25.
Distributed Learning from Multiple EHR Databases
Distributed Learning Predictor: Python library for learning from multiple databases and building predictive models based on Distributed Noise Contrastive Estimation (Distributed NCE)
Reference: Li, Z., Roberts, K.E., Jiang, X., and Long, Q. Distributed Learning from Multiple EHR Databases: Contextual Embedding Models for Medical Events. Journal of Biomedical Informatics, 2019, 92, p.103138.
Link for the software on github: https://github.com/ziyili20/DistributedLearningPredictor
Sparse Multiple Co-Inertia Analysis
pmCIA: R package for sparse multiple co-inertia analysis of multiple datasets
Reference: Min, E.J. and Long, Q., 2020. Sparse multiple co-Inertia analysis with application to integrative analysis of multi-Omics data. BMC Bioinformatics, 21, pp.1-12.
Graph-guided Bayesian SVM
Graph-guided Bayesian SVM: Matlab code for the graph-guided Bayesian SVM
Reference: Wenli Sun, Changgee Chang, and Qi Long, "Graph-guided Bayesian SVM with Adaptive Structured Shrinkage Prior for high-dimensional data"
Distributed Multiple Imputation
Distributed Multiple Imputation: R code for the simulations reported in the referenced paper
Reference: Changgee Chang, Yi Deng, Xiaoqian Jiang, and Qi Long. (2020) "Multiple Imputation for Analysis of Incomplete Data in Distributed Health Data Networks" Nature Communications, 11(1):5467.
R Code: Distributed-MI.zip
Deep Learning with Gaussian Differential Privacy
Deep Learning with Gaussian Differential Privacy: Python code
Reference: Bu, Z., Dong, J., Long, Q., and Su, W. (2020) Deep Learning with Gaussian Differential Privacy. Harvard Data Science Review, 2(3):1-48.
Python Library for TensorFlow Privacy including Gaussian DP: https://github.com/tensorflow/privacy
Bayesian Graphical Models of Single-Cell RNA-Sequencing Data
Accounting for Technical Noise in Bayesian Graphical Models of Single-Cell RNA-Sequencing Data: Python code
Reference: Oh, J., Chang, C. and Long, Q. (2021) Accounting for Technical Noise in Bayesian Graphical Models of Single-Cell RNA-Sequencing Data. Biostatistics, in press