Ryan J. Urbanowicz, Ph.D.

Adjunct Assistant Professor of Biostatistics and Epidemiology
Department: Biostatistics and Epidemiology

Contact information
403 Blockley Hall
423 Guardian Drive
University of Pennsylvania
Philadelphia, PA 19104-6116
Office: 802-299-9461
Fax: 215-573-3111
Lab: 215-746-4225

Education
B.S. (Biological and Environmental Engineering)
Cornell University, 2004.
M.Eng (Biological and Environmental Eng.)
Cornell University, 2005.
Ph.D. (Genetics/Computational Biology)
Dartmouth College, 2012.

Description of Research Expertise

Common human disease research has evolved into a largely complex and interdisciplinary pursuit. Modern epidemiological challenges such as the characterization of complex systems, the management of ‘big data’, or the integration of data for systems biology epitomize this trend. The early stages of biomedical research typically focus on connecting predictive factors, whether they be genetic, epigenetic or environmental, to increased or decreased common disease susceptibility. This attempt to detect patterns of association is likely complicated by non-linear phenomena such as complex gene-gene interactions, gene-environment interactions, genetic heterogeneity, and phenocopy. My primary research interests focus on the development, evaluation, and application of novel computational, statistical, and visualization methods to facilitate classification and data mining in the complex, noisy domain of biomedical research.

My thesis research focused on the adaptation of a learning classifier system (LCS) algorithm to the task of detecting, modeling, and characterizing epistatic and heterogeneous associations within single nucleotide polymorphism (SNP) association studies. The development and application of LCS algorithms have since become a particular area of specialization. My post-doctoral work expanded upon this successful LCS groundwork, leading to the development of ExSTraCS, an Extended Supervised Tracking and Classifying System. This work epitomizes my interest in (1) developing strategies that limit the number of assumptions made about the data and instead allow the data to speak for itself when detecting complex or heterogeneous patterns, (2) allowing for the integration of data types by offering an algorithmic framework that functions for all combinations of discrete or continuous attributes and endpoints, and (3) promoting a user-friendly, interpretable environment for knowledge discovery. My work with LCS algorithms has also led me to pursue visual and statistical strategies with which to guide and facilitate knowledge discovery. My interests have also branched off into the theory and practice of complex disease model and data simulation, which led to the development of the open source GAMETES software package. In addition, my interest in tackling issues related to ‘big data’ has motivated me to explore, expand, and develop new feature selection approaches (e.g., ReliefF, SURF, SURF*, MultiSURF*, and MultiSURF) for computational and algorithmic flexibility and efficiency. These algorithms offer critical preprocessing steps for feature selection and for the generation and application of statistical, objective, and unbiased expert knowledge to more efficiently guide stochastic algorithm learning.

In summary, my research interests lie at the intersection of genetics, genomics, biostatistics, epidemiology, machine learning, and computer science. I have adopted a quantitative biomedical research strategy that embraces, rather than ignores, the complexity of the relationship between predictive factors and disease endpoints.

Selected Publications

Olson, R.S., LaCava, W., Orzechowski, P., Urbanowicz, R.J., Moore, J.H.: PMLB: A large benchmark suite for machine learning evaluation and comparison. BioData Mining, December 2017.

Created Educational YouTube Video: Learning Classifier Systems in a Nutshell. https://www.youtube.com/watch?v=CRge_cZ2cJc 2017.

Urbanowicz, R.J., Browne, W.: Introduction to learning classifier systems. Springer, New York, NY, 2016 Notes: Available on amazon.com.

Olson, R.S., Urbanowicz, R.J., Moore, J.H.: Evaluation of a tree-based pipeline optimization tool for automating data science. Proceedings of the Genetic and Evolutionary Computation Conference. ACM Press, Page: 485-492, 2016 Notes: Highlight: Won a best paper award in the Evolutionary Machine Learning Track at GECCO’16.

Urbanowicz, R.J., Olson, R.S., Moore, J.H.: Pareto inspired multi-objective rule fitness for noise-adaptive rule-based machine learning. Springer Lecture Notes in Computer Science Page: 514-524, 2016.

Olson, R.S., Urbanowicz, R.J., Moore, J.H.: Automating biomedical data science through tree-based pipeline optimization. Springer Lecture Notes in Computer Science Page: 123-137, 2016 Notes: Highlight: Won a best paper award in the EvoBIO track.

Urbanowicz, R.J., Moore, J.H.: ExSTraCS 2.0: Description and evaluation of a scalable learning classifier system. Evolutionary Intelligence. 8(2-3): 89-116, 2015 Notes: Highlight: Solved the extremely complex 135-bit benchmark multiplexer problem directly for the first time reported in literature.

Urbanowicz, R.J., Moore, J.H.: Retooling fitness for noisy problems in a supervised Michigan-style learning classifier system. Proceedings of the Genetic and Evolutionary Computation Conference. ACM Press, Page: 591-598, 2015.

Urbanowicz, R.J., Ramanand, N., Moore, J.H.: Continuous endpoint data mining with ExSTraCS. Proceedings of the Genetic and Evolutionary Computation Conference. ACM Press, Page: 1029-1036, 2015.

Urbanowicz, R.J., Moore, J.H.: Learning classifier systems: The rise of genetics-based machine learning in biomedical data mining. In: Sarkar, N. (Ed.), Methods in Biomedical Informatics, 1st Edition, Elsevier. 2014.

Last updated: 02/25/2019