Dr. Stoeckert directs the Computational Biology and Informatics Laboratory. The goal of our work is to help make sense of the enormous amount of biomedical data generated by high-throughput genomic approaches and to synthesize those data into something more than the sum of their parts. To that end, we are developing tools that enable researchers to mine and integrate data from a variety of sources and types of experiments.
The first step in that process is the development of data warehouses that collect and store information in a usable fashion. In one such project, we have been working with David S. Roos, Ph.D., E. Otis Kendall Professor of Biology at Penn, and Jessica Kissinger, Ph.D., at the University of Georgia, to develop a bioinformatics resource center for eukaryotic pathogens, funded by the National Institute of Allergy and Infectious Diseases. Within the resource center, we have built databases that serve research communities interested in specific pathogens. For example, PlasmoDB houses information on the parasites that cause malaria.
To maximize the utility of data warehouses, we must have ways to represent and store data that enable researchers to make connections between experiments and between data from different types of experiments. Therefore, part of my group is involved in knowledge representation and in developing ontologies, which standardize data through the use of controlled vocabularies and relationships. Our goal is to provide the tools, including ontologies, that allow people to annotate their experiments or mark up their papers so that another researcher can efficiently search for and combine particular kinds of results from a variety of sources.
We work with a number of groups on ontology projects, including the Ontology for Biomedical Investigations Consortium, which is a member of the Open Biological and Biomedical Ontologies (OBO) Foundry. I have also been involved in a number of standards projects over the years, and am currently on the board of the FGED Society, which promotes data sharing and standardized representation of data, particularly from genomic experiments.
In addition to building systems that help other researchers maximize the value of their data, my team is involved in model building and network analysis with the aim of discovering new insights into biology. One area we focus on is type 1 diabetes. As a member of the Beta Cell Biology Consortium, we have established a data warehouse that houses datasets from consortium members. Additionally, our role has been to help the consortium integrate information from those datasets, as well as from key datasets from outside the consortium, and to put those data into the context of beta cell development and diabetes.
For example, while many researchers look at the list of genes produced in a microarray experiment, we try to go beyond list making and use computational methods to uncover connections between genes. To do that, we are developing networks of genes based on expression data and information from a variety of other sources, including published information on known interactions and computational analyses that predict interactions between genes. Once we have those data, we can start to visualize interacting partners and show where and when they are important in beta cell function and development.
The approaches and tools we develop in one research arena can often be applied to another one. For instance, we are applying our data integration and analysis approaches to high-throughput sequencing data, including RNA-seq and ChIP-seq data.
Jiang YZ, Manduchi E, Stoeckert CJ Jr, Davies PF.: Arterial endothelial methylome: differential DNA methylation in athero-susceptible disturbed flow regions in vivo. BMC Genomics 16: 506, July 2015.
Gutierrez JB, Harb OS, Zheng J, Tisch DJ, Charlebois ED, Stoeckert CJ Jr, Sullivan SA.: A Framework for Global Collaborative Data Management for Malaria Research. Am J Trop Med Hyg. 93: 124-32, September 2015.
Osipovich AB, Long Q, Manduchi E, Gangula R, Hipkens SB, Schneider J, Okubo T, Stoeckert CJ Jr, Takada S, Magnuson MA: Insm1 promotes endocrine cell differentiation by modulating the expression of a network of genes that includes Neurog3 and Ripply3. Development 141(15): 2939-49, August 2014.
Dugan VG, Emrich SJ, Giraldo-Calderón GI, Harb OS, Newman RM, Pickett BE, Schriml LM, Stockwell TB, Stoeckert CJ Jr, Sullivan DE, Singh I, Ward DV, Yao A, Zheng J, Barrett T, Birren B, Brinkac L, Bruno VM, Caler E, Chapman S, Collins FH, Cuomo CA, Di Francesco V, Durkin S, Eppinger M, Feldgarden M, Fraser C, Fricke WF, Giovanni M, Henn MR, Hine E, Hotopp JD, Karsch-Mizrachi I, Kissinger JC, Lee EM, Mathur P, Mongodin EF, Murphy CI, Myers G, Neafsey DE, Nelson KE, Nierman WC, Puzak J, Rasko D, Roos DS, Sadzewicz L, Silva JC, Sobral B, Squires RB, Stevens RL, Tallon L, Tettelin H, Wentworth D, White O, Will R, Wortman J, Zhang Y, Scheuermann RH.: Standardized metadata for human pathogen/vector genomic sequences. PLoS One 9(6): e99979, June 2014.
Sarntivijai S, Lin Y, Xiang Z, Meehan TF, Diehl AD, Vempati UD, Schürer SC, Pang C, Malone J, Parkinson H, Liu Y, Takatsuki T, Saijo K, Masuya H, Nakamura Y, Brush MH, Haendel MA, Zheng J, Stoeckert CJ, Peters B, Mungall CJ, Carey TE, States DJ, Athey BD, He Y.: CLO: The cell line ontology. J Biomed Semantics 5: 37, August 2014.
Davies PF, Manduchi E, Stoeckert CJ, Jiménez JM, Jiang YZ: Emerging topic: flow-related epigenetic regulation of endothelial phenotype through DNA methylation. Vascul Pharmacol. 62(2): 88-93, August 2014.
Zheng J, Xiang Z, Stoeckert CJ Jr, He Y.: Ontodog: a web-based ontology community view generation tool. Bioinformatics 30(9): 1340-1342, Feb 2014.
Greenfest-Allen E, Malik J, Palis J, Stoeckert CJ Jr.: Stat and interferon genes identified by network analysis differentially regulate primitive and definitive erythropoiesis. BMC Syst Biol. 7: 38, May 2013.
Kingsley PD, Greenfest-Allen E, Frame JM, Bushnell TP, Malik J, McGrath KE, Stoeckert CJ, Palis J.: Ontogeny of erythroid gene expression. Blood 121(6): e5-e13, Feb 2013.
Aurrecoechea C, Barreto A, Brestelli J, Brunk BP, Cade S, Doherty R, Fischer S, Gajria B, Gao X, Gingle A, Grant G, Harb OS, Heiges M, Hu S, Iodice J, Kissinger JC, Kraemer ET, Li W, Pinney DF, Pitts B, Roos DS, Srinivasamoorthy G, Stoeckert CJ Jr, Wang H, Warrenfeltz S.: EuPathDB: The Eukaryotic Pathogen database. Nucleic Acids Res. 41(D1): D684-91, January 2013.
Last updated: 04/18/2016
The Trustees of the University of Pennsylvania