Qi Long, Ph.D.

Professor of Biostatistics in Biostatistics and Epidemiology
Department: Biostatistics and Epidemiology

Contact information
Department of Biostatistics, Epidemiology and Informatics
Perelman School of Medicine
University of Pennsylvania
201 Blockley Hall
423 Guardian Drive
Philadelphia, PA 19104
Office: 215-573-0659
Fax: 215-573-1050
B.S. (Biochemistry)
School of Gifted Young, University of Science and Technology of China, Hefei, Anhui, China, 1998.
M.S. (Biostatistics)
University of Michigan, Ann Arbor, MI, 2003.
Ph.D. (Biostatistics)
University of Michigan, Ann Arbor, MI, 2005.

Description of Research Expertise

Dr. Long's research deliberately combines novel statistical and machine learning research with impactful biomedical research, each of which reinforces the other. Its thrust is to develop statistical and machine learning methods for advancing precision medicine and population health. Specifically, he has developed methods for analysis of big health data (-omics, EHRs, and mHealth data), predictive modeling, missing data, causal inference, data privacy and algorithmic fairness, Bayesian methods, and clinical trials. He has also collaborated extensively in biomedical research areas such as cancer, cardiovascular diseases, neurological disorders and neurodegeneration, diabetes, kidney diseases, and mental health.

Dr. Long’s methodological research has been supported by the National Institutes of Health, the Patient-Centered Outcomes Research Institute, and the National Science Foundation.

Dr. Long is the founding Director of the Center for Cancer Data Science, and Associate Director for Cancer Informatics of the Penn Institute for Biomedical Informatics. He also directs the Biostatistics and Bioinformatics Core in the Abramson Cancer Center at the University of Pennsylvania.

Dr. Long is an elected fellow of the American Statistical Association and an elected member of the International Statistical Institute.

Selected Publications

Chang, C., Deng, Y., Jiang, X. and Long, Q.: Multiple Imputation for Analysis of Incomplete Data in Distributed Health Data Networks. Nature Communications 11(1): 5467, 2020.

Chang C, Jang A, Manatunga A, Taylor AT, Long Q: A Bayesian Latent Class Model to Predict Kidney Obstruction Based on Renography and Expert Ratings in the Absence of Gold Standard. Journal of the American Statistical Association, in press, 2020 Notes: doi.org/10.1080/01621459.2019.1689983.

Bu, Z., Dong, J., Long, Q. and Su, W.: Deep Learning with Gaussian Differential Privacy. Harvard Data Science Review 2020.

Zheng, Q., Dong, J., Long, Q., and Su, W.: Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion. The 37th International Conference on Machine Learning (ICML 2020) 2020.

Deng, Y., Jiang, X., and Long, Q.: Privacy-Preserving Methods for Vertically Partitioned Incomplete Data. AMIA'20: AMIA 2020 Annual Symposium 2020.

Min, E.J.*, Safo, S.E., and Long, Q.: Penalized Co-Inertia Analysis with Applications to -Omics Data. Bioinformatics 35(6): 1018-1025, 2018 Notes: *mentee.

Li Z, Roberts K, Jiang X, Long Q: Distributed Learning from Multiple EHR Databases: Contextual Embedding Models for Medical Events. Journal of Biomedical Informatics 92: 103138, 2019 Notes: doi: 10.1016/j.jbi.2019.103138. Epub 2019 Feb 27.

Zhao, Y., Chang, C., and Long, Q.*: Knowledge-guided statistical learning methods for analysis of high-dimensional -omics data in precision oncology. JCO Precision Oncology 3: 1-9, 2019 Notes: *Corresponding author.

Leng Q, Tarbe M, Long Q, Wang F: Pre-existing heterologous T-cell immunity and neoantigen immunogenicity. Clinical & Translational Immunology 9(3): e01111, 2020 Notes: doi: 10.1002/cti2.1111. eCollection 2020. Review.

Safo, S.E.*, Li, S., and Long, Q.: Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information. Biometrics 74(1): 300-312, 2018 Notes: *mentee.

Chang C, Kundu S, Long Q: Scalable Bayesian variable selection for structured high-dimensional data. Biometrics 74(4): 1372-1382, 2018 Notes: doi: 10.1111/biom.12882. Epub 2018 May 8.

Long, Q., Little, R.J., and Lin, X.: Causal inference in hybrid intervention trials involving treatment choice. Journal of the American Statistical Association 103(482): 474-484, 2008.

Long, Q., Xu, J., Osunkoya, A.O., Sannigrahi, S., Johnson, B.A., Zhou, W., Gillespie, T., Park, J.Y., Nam, R.K., Sugar, L., Stanimirovic, A., Seth, A.K., Petros, J.A., and Moreno, C.S.: Global transcriptome analysis of formalin-fixed prostate cancer specimens identifies biomarkers of disease recurrence. Cancer Research 74(12): 3228-3237, 2014.

Long, Q. and Johnson, B.A.: Variable selection in the presence of missing data: resampling and imputation. Biostatistics 16(3): 596-610, 2015.

Zhao, Y.* and Long, Q.: Multiple imputation in the presence of high-dimensional data. Statistical Methods in Medical Research 25(5): 2021-2035, Oct 2016 Notes: *mentee.

Zhao, Y.*, Chung, M., Johnson, B.A., Moreno, C.S., and Long, Q.: Hierarchical feature selection incorporating known and novel biological information: Identifying genomic features related to prostate cancer recurrence. Journal of the American Statistical Association 111(516): 1427-1439, 2016 Notes: *mentee (an earlier version won Yize Zhao the David P. Byar Travel Award from the American Statistical Association's Biometrics Section in 2014).

Last updated: 11/25/2020
The Trustees of the University of Pennsylvania