Dr. Stoeckert directs the Computational Biology and Informatics Laboratory. The goal of our work is to help make sense of the enormous amount of biomedical data generated by high-throughput genomic approaches and synthesize them into something more than the sum of the parts. To that end, we are developing tools that enable researchers to mine and integrate data from a variety of different sources and types of experiments.
The first step in that process is the development of data warehouses that collect and store information in a usable fashion. In one such project, we have been working with David S. Roos, Ph.D., E. Otis Kendall Professor of Biology at Penn, and Jessica Kissinger, Ph.D., at the University of Georgia, to develop a bioinformatics resource center for eukaryotic pathogens, funded by the National Institute of Allergy and Infectious Diseases. Within the resource center, we have built databases that serve research communities interested in specific pathogens. For example, PlasmoDB houses information on the parasites that cause malaria.
To maximize the utility of data warehouses, we must have ways to represent and store data that enable researchers to make connections between experiments and between data from different types of experiments. Therefore, part of my group is involved in knowledge representation and in developing ontologies, which standardize data through the use of controlled vocabularies and relationships. Our goal is to provide the tools, including ontologies, that allow people to annotate their experiments or mark up their papers so that other researchers can efficiently search for and combine particular kinds of results from a variety of sources.
We work with a number of groups on ontology projects, including the Ontology for Biomedical Investigations Consortium, which is a member of the Open Biological and Biomedical Ontologies (OBO) Foundry. I have also been involved in a number of standards projects over the years, and am currently on the board of the FGED Society, which promotes data sharing and standardized representation of data, particularly from genomic experiments.
In addition to building systems that help other researchers maximize the value of their data, my team is involved in model building and network analysis with the aim of discovering new insights into biology. One area we focus on is type 1 diabetes. As a member of the Beta Cell Biology Consortium, we have established a data warehouse that houses datasets from consortium members. Additionally, our role has been to help the consortium integrate information from those datasets, as well as from key datasets from outside the consortium, and to put those data into the context of beta cell development and diabetes.
For example, while many researchers look at the list of genes produced in a microarray experiment, we try to go beyond list making and use computational methods to uncover connections between genes. To do that, we are developing networks of genes based on expression data and information from a variety of other sources, including published information on known interactions and computational analyses that predict interactions between genes. Once we have those data, we can start to visualize interacting partners and show where and when they are important in beta cell function and development.
The approaches and tools we develop in one research arena can often be applied to another one. For instance, we are applying our data integration and analysis approaches to high-throughput sequencing data, including RNA-seq and ChIP-seq data.
Brochhausen M, Zheng J, Birtwell D, Williams H, Masci AM, Ellis HJ, Stoeckert CJ Jr.: OBIB-a novel ontology for biobanking. J Biomed Semantics 7: 23, May 2016.
Bandrowski A, Brinkman R, Brochhausen M, Brush MH, Bug B, Chibucos MC, Clancy K, Courtot M, Derom D, Dumontier M, Fan L, Fostel J, Fragoso G, Gibson F, Gonzalez-Beltran A, Haendel MA, He Y, Heiskanen M, Hernandez-Boussard T, Jensen M, Lin Y, Lister AL, Lord P, Malone J, Manduchi E, McGee M, Morrison N, Overton JA, Parkinson H, Peters B, Rocca-Serra P, Ruttenberg A, Sansone SA, Scheuermann RH, Schober D, Smith B, Soldatova LN, Stoeckert CJ Jr, Taylor CF, Torniai C, Turner JA, Vita R, Whetzel PL, Zheng J.: The Ontology for Biomedical Investigations. PLoS One 11(4): e0154556, April 2016.
Aurrecoechea C, Barreto A, Basenko EY, Brestelli J, Brunk BP, Cade S, Crouch K, Doherty R, Falke D, Fischer S, Gajria B, Harb OS, Heiges M, Hertz-Fowler C, Hu S, Iodice J, Kissinger JC, Lawrence C, Li W, Pinney DF, Pulman JA, Roos DS, Shanmugasundram A, Silva-Franco F, Steinbiss S, Stoeckert CJ Jr, Spruill D, Wang H, Warrenfeltz S, Zheng J.: EuPathDB: the eukaryotic pathogen genomics database resource. Nucleic Acids Res. 45: D581-D591, January 2017.
Clayton HW, Osipovich AB, Stancill JS, Schneider JD, Vianna PG, Shanks CM, Yuan W, Gu G, Manduchi E, Stoeckert CJ Jr, Magnuson MA.: Pancreatic Inflammation Redirects Acinar to β Cell Reprogramming. Cell Rep. 17(8): 2028-2041, November 2016.
McCormick ME, Manduchi E, Witschey WRT, Gorman RC, Gorman JH III, Jiang YZ, Stoeckert CJ Jr, Barker AJ, Markl M, Davies PF.: Integrated regional cardiac hemodynamic imaging and RNA sequencing reveal corresponding heterogeneity of ventricular wall shear stress and endocardial transcriptome. Journal of the American Heart Association 2016. In press.
Jiang YZ, Manduchi E, Stoeckert CJ Jr, Davies PF.: Arterial endothelial methylome: differential DNA methylation in athero-susceptible disturbed flow regions in vivo. BMC Genomics 16: 506, July 2015.
Gutierrez JB, Harb OS, Zheng J, Tisch DJ, Charlebois ED, Stoeckert CJ Jr, Sullivan SA.: A Framework for Global Collaborative Data Management for Malaria Research. Am J Trop Med Hyg. 93: 124-32, September 2015.
Osipovich AB, Long Q, Manduchi E, Gangula R, Hipkens SB, Schneider J, Okubo T, Stoeckert CJ Jr, Takada S, Magnuson MA.: Insm1 promotes endocrine cell differentiation by modulating the expression of a network of genes that includes Neurog3 and Ripply3. Development 141(15): 2939-49, August 2014.
Dugan VG, Emrich SJ, Giraldo-Calderón GI, Harb OS, Newman RM, Pickett BE, Schriml LM, Stockwell TB, Stoeckert CJ Jr, Sullivan DE, Singh I, Ward DV, Yao A, Zheng J, Barrett T, Birren B, Brinkac L, Bruno VM, Caler E, Chapman S, Collins FH, Cuomo CA, Di Francesco V, Durkin S, Eppinger M, Feldgarden M, Fraser C, Fricke WF, Giovanni M, Henn MR, Hine E, Hotopp JD, Karsch-Mizrachi I, Kissinger JC, Lee EM, Mathur P, Mongodin EF, Murphy CI, Myers G, Neafsey DE, Nelson KE, Nierman WC, Puzak J, Rasko D, Roos DS, Sadzewicz L, Silva JC, Sobral B, Squires RB, Stevens RL, Tallon L, Tettelin H, Wentworth D, White O, Will R, Wortman J, Zhang Y, Scheuermann RH.: Standardized metadata for human pathogen/vector genomic sequences. PLoS One 9(6): e99979, June 2014.
Sarntivijai S, Lin Y, Xiang Z, Meehan TF, Diehl AD, Vempati UD, Schürer SC, Pang C, Malone J, Parkinson H, Liu Y, Takatsuki T, Saijo K, Masuya H, Nakamura Y, Brush MH, Haendel MA, Zheng J, Stoeckert CJ, Peters B, Mungall CJ, Carey TE, States DJ, Athey BD, He Y.: CLO: The cell line ontology. J Biomed Semantics 5: 37, August 2014.
Last updated: 12/21/2017
The Trustees of the University of Pennsylvania