Qi Long, Ph.D.

Professor of Biostatistics in Biostatistics and Epidemiology
Director, Biostatistics and Bioinformatics Core, Abramson Cancer Center
Senior Scholar, Center for Clinical Epidemiology and Biostatistics
Associate Director, Penn Institute for Biomedical Informatics
Director, Center for Cancer Data Science
Professor, Department of Computer and Information Science, School of Engineering and Applied Science
Senior Fellow, Penn Leonard Davis Institute
Professor, Department of Statistics and Data Science, The Wharton School
Department: Biostatistics and Epidemiology

Contact information
Department of Biostatistics, Epidemiology and Informatics
Perelman School of Medicine
University of Pennsylvania
201 Blockley Hall
423 Guardian Drive
Philadelphia, PA 19104
Office: 215-573-0659
Fax: 215-573-1050
B.S. (Biochemistry)
School of Gifted Young, University of Science and Technology of China, Hefei, Anhui, China, 1998.
M.S. (Biostatistics)
University of Michigan, Ann Arbor, MI, 2003.
Ph.D. (Biostatistics)
University of Michigan, Ann Arbor, MI, 2005.

Description of Research Expertise

Dr. Long's research purposefully combines novel statistical and machine learning research with impactful biomedical research, each of which reinforces the other. Its thrust is to develop robust statistical and machine learning methods for advancing intelligent and equitable health and medicine. Specifically, he has developed methods for the analysis of big health data (-omics, EHRs, and mHealth data), predictive modeling, missing data, causal inference, data privacy, data and algorithmic fairness, Bayesian methods, and clinical trials. Dr. Long’s methodological research has been supported by the National Institutes of Health, the Patient-Centered Outcomes Research Institute, and the National Science Foundation.

Dr. Long has directed the Statistical and Data Coordinating Center for national research networks and large-scale multi-site clinical studies, supervising a team of database administrators, programmers, application developers, and statistical analysts. He currently co-directs (with Dr. Nicola Mason at Penn Vet) the Coordinating Center for the Premedical Cancer Immunotherapy Network for Canine Trials (PRECINCT), part of NCI’s Cancer Moonshot Initiative.

Dr. Long is the founding Director of the Center for Cancer Data Science, and Associate Director for Cancer Informatics of the Penn Institute for Biomedical Informatics. He also directs the Biostatistics and Bioinformatics Core in the Abramson Cancer Center at the University of Pennsylvania.

Dr. Long is an elected fellow of the American Association for the Advancement of Science (AAAS), an elected fellow of the American Statistical Association (ASA), and an elected member of the International Statistical Institute (ISI).

Selected Publications

Fang, C., He, H., Long, Q., Su, W.: Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training. Proceedings of the National Academy of Sciences (PNAS) 118(43): e2103091118, 2021.

Zhang, Y., Long, Q.: Assessing Fairness in the Presence of Missing Data. 2021 Conference on Neural Information Processing Systems (NeurIPS 2021) 34: 16007-16019, 2021.

Chang, C., Jang, A., Manatunga, A., Taylor, A.T., Long, Q.: A Bayesian Latent Class Model to Predict Kidney Obstruction Based on Renography and Expert Ratings in the Absence of Gold Standard. Journal of the American Statistical Association 115(532): 1645-1663, 2020.

Chang, C., Deng, Y., Jiang, X., Long, Q.: Multiple Imputation for Analysis of Incomplete Data in Distributed Health Data Networks. Nature Communications 11(1): 5467, 2020.

Bu, Z., Dong, J., Long, Q., Su, W.: Deep Learning with Gaussian Differential Privacy. Harvard Data Science Review 2(3): 1-48, 2020.

Zheng, Q., Dong, J., Long, Q., Su, W.: Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion. Proceedings of the 37th International Conference on Machine Learning (ICML 2020) 119: 11420-11435, 2020.

Deng, Y., Jiang, X., Long, Q.: Privacy-Preserving Methods for Vertically Partitioned Incomplete Data. 2020 AMIA Annu Symp Proc: 348-357, 2020. Notes: This paper won the Distinguished Paper Award at the AMIA 2020 Annual Symposium.

Zhao, Y., Chang, C., and Long, Q.: Knowledge-guided statistical learning methods for analysis of high-dimensional -omics data in precision oncology. JCO Precision Oncology 3: 1-9, 2019.

Min EJ, Safo SE, Long Q: Penalized co-inertia analysis with applications to -omics data. Bioinformatics 35(6): 1018-1025, 2019. Notes: doi: 10.1093/bioinformatics/bty726.

Li Z, Roberts K, Jiang X, Long Q: Distributed Learning from Multiple EHR Databases: Contextual Embedding Models for Medical Events. Journal of Biomedical Informatics 92: 103138, 2019. Notes: doi: 10.1016/j.jbi.2019.103138. Epub 2019 Feb 27.

Zhao, Y.*, Chung, M., Johnson, B.A., Moreno, C.S., and Long, Q.: Hierarchical feature selection incorporating known and novel biological information: Identifying genomic features related to prostate cancer recurrence. Journal of the American Statistical Association 111(516): 1427-1439, 2016. Notes: *An earlier version won Yize Zhao the David P. Byar Travel Award from the American Statistical Association’s Biometrics Section in 2014.

Chang C, Kundu S, Long Q: Scalable Bayesian variable selection for structured high-dimensional data. Biometrics 74(4): 1372-1382, 2018. Notes: doi: 10.1111/biom.12882. Epub 2018 May 8.

Safo, S.E., Li, S., and Long, Q.: Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information. Biometrics 74(1): 300-312, 2018.

Long, Q., Xu, J., Osunkoya, A.O., Sannigrahi, S., Johnson, B.A., Zhou, W., Gillespie, T., Park, J.Y., Nam, R.K., Sugar, L., Stanimirovic, A., Seth, A.K., Petros, J.A., and Moreno, C.S.: Global transcriptome analysis of formalin-fixed prostate cancer specimens identifies biomarkers of disease recurrence. Cancer Research 74(12): 3228-3237, 2014.

Long, Q. and Johnson, B.A.: Variable selection in the presence of missing data: resampling and imputation. Biostatistics 16(3): 596-610, 2015.

Long, Q., Little, R.J., and Lin, X.: Causal inference in hybrid intervention trials involving treatment choice. Journal of the American Statistical Association 103(482): 474-484, 2008.

Last updated: 03/05/2024
The Trustees of the University of Pennsylvania