EHR-Stats

EHR Publications

Example publications on methodology for the analysis of EHR data from this lab include:

  1. Yuan C, Linn KA, Hubbard RA. 2023. Algorithmic fairness of machine learning models for Alzheimer's Disease progression. JAMA Network Open. In press.
  2. Hubbard RA, Pujol TA, Alhajjar E, Edoh K, Martin ML. 2023. Identifying sources of disparities in surveillance mammography performance and personalized recommendations for supplemental breast imaging: A simulation study. Cancer Epidemiology, Biomarkers & Prevention. In press.
  3. Vader DY, Mamtani R, Li Y, Griffith SD, Calip GS, Hubbard RA. 2023. Inverse probability of treatment weighting and missingness in confounder data in EHR-based analyses: a comparison of three missing data approaches using plasmode simulation. Epidemiology. 34(4):520-530.
  4. Getz K, Hubbard RA, Linn K. 2023. Performance of multiple imputation using modern machine learning methods in electronic health records data. Epidemiology. 34(2):206-215.
  5. Su Y-R, Buist DSM, Lee JM, Ichikawa L, Miglioretti D, Bowles E, Wernli KJ, Kerlikowske K, Tosteson A, Lowry KP, Henderson L, Sprague B, Hubbard RA. 2023. Performance of statistical and machine learning risk prediction models for breast cancer surveillance benefits and failures. Cancer Epidemiology, Biomarkers & Prevention. 32(4):561-571.
  6. Harton J, Segal B, Mamtani R, Mitra N, Hubbard RA. 2023. Combining real-world and randomized control trial data using data-adaptive weighting via the on-trial score. Statistics in Biopharmaceutical Research. 15(2):408-420.
  7. Harton J, Mitra N, Hubbard RA. 2022. Informative presence bias in analyses of electronic health records-derived data: A cautionary note. Journal of the American Medical Informatics Association. 29(7):1191-9.
  8. Getz K, Mamtani R, Hubbard RA. 2021. Integrating real world data and clinical trial results using survival data reconstruction and marginal moment-balancing weights. Journal of Biopharmaceutical Statistics. 32(1):191-203.
  9. Hubbard RA, Lett E, Ho G, Chubak J. 2021. Characterizing bias due to differential exposure ascertainment in electronic health record data. Health Services and Outcomes Research Methodology. 21:309-323.
  10. Hubbard RA, Xu J, Chen Y, Siegel R, Eneli I. 2020. Studying pediatric health outcomes with electronic health records using Bayesian clustering and trajectory analysis. Journal of Biomedical Informatics. 113:103654.
  11. Harton J, Mamtani R, Mitra N, Hubbard RA. Bias reduction methods for propensity scores estimated from imperfect EHR-derived covariates. Health Services and Outcomes Research Methodology. 21(2):169-187.
  12. Hubbard RA, Tong J, Duan R, Chen Y. 2020. Reducing bias due to outcome misclassification for epidemiologic studies using EHR-derived probabilistic phenotypes. Epidemiology. 31(4):542-50.
  13. Tong J, Huang J, Chubak J, Wang X, Hubbard RA, Chen Y. 2020. An augmented estimation procedure for EHR-based association studies accounting for differential misclassification. Journal of the American Medical Informatics Association. 27(2):244-53.
  14. Hubbard RA, Huang J, Harton J, Oganisian A, Choi G, Utidjian L, Eneli I, Bailey LC, Chen Y. 2019. A Bayesian latent class approach for EHR-based phenotyping. Statistics in Medicine. 38(1):74-87.
  15. Chen Y, Wang J, Chubak J, Hubbard RA. 2019. Inflation of type I error rates due to differential misclassification in EHR-derived outcomes: Empirical illustration using breast cancer recurrence. Pharmacoepidemiology & Drug Safety. 28(2):264-268.
  16. Huang J, Duan R, Hubbard RA, Wu Y, Moore JH, Xu H, Chen Y. 2018. PIE: A prior knowledge guided integrated likelihood estimation method for bias reduction in association studies using electronic health records data. Journal of the American Medical Informatics Association. 25(3):345-352.
  17. Hubbard RA, Johnson E, Chubak J, Wernli K, Kamineni A, Bogart T, Rutter CM. 2017. Accounting for misclassification in electronic health records-derived exposures using generalized linear finite mixture models. Health Services and Outcomes Research Methodology. 17(2):101-112. 
  18. Duan R, Cao M, Wu Y, Huang J, Denny J, Xu H, Chen Y. 2016. An empirical study for impacts of measurement errors on EHR based association studies. AMIA Annual Symposium Proceedings. 2016:1764-1773. (This paper won first prize for "Best of Student Papers in Knowledge Discovery and Data Mining (KDDM)" at the AMIA 2016 meeting.)
  19. Hubbard RA, Lange J, Zhang Y, Salim BA, Stroud JR, Inoue LYT. 2016. Using semi-Markov processes to study timeliness and tests used in the diagnostic evaluation of suspected breast cancer. Statistics in Medicine. 35(27):4980-4993.
  20. Lange JM, Hubbard RA, Inoue LYT, Minin VN. 2015. A joint model for multistate disease processes and random informative observation times, with applications to electronic medical records data. Biometrics. 71(1):90-101. 
  21. Cao M, Chen Y, Zhu M, Zhang J. 2015. Automated evaluation of medical software usage: Algorithm and statistical analyses. Studies in Health Technology and Informatics. 216:965.
  22. Hubbard RA, Benjamin-Johnson R, Onega T, Smith-Bindman R, Zhu W, Fenton JJ. 2015. Classification accuracy of Medicare claims-based methods for identifying providers failing to meet performance targets. Statistics in Medicine. 34(1):93-105.
  23. Hubbard RA, Chubak J, Rutter CM. 2014. Estimating screening test utilization using electronic health records data. eGEMs (Generating Evidence & Methods to improve patient outcomes). 2(1):14.

© The Trustees of the University of Pennsylvania