Kai Wang, Ph.D.
Children's Hospital of Philadelphia
Philadelphia, PA 19104
B.S. (Biochemistry & Molecular Biology)
Peking University, 2000.
M.S. (Tumor Biology)
Mayo Clinic, 2002.
Ph.D. (Microbiology & Computational Biology)
University of Washington, 2005.
Description of Research Expertise
The research in our laboratory aims to develop novel genomics and bioinformatics methods to improve the diagnosis, treatment, and prognosis of rare diseases, and ultimately to facilitate the implementation of genomic medicine at scale. Our research can be divided into several areas.
First, we are developing analytical pipelines for whole genome and whole exome sequencing data, all the way from FASTQ/FAST5 files to biological insights. Some examples of computational tools used in the lab include SeqMule, ANNOVAR, Phenolyzer, SeqHBase, and InterVar. These approaches facilitate a better understanding of the functional content of sequencing data and the clinical insights that can be drawn from it.
Second, we are developing genomic assays and methods to analyze long-range genomic data, such as those generated by linked-read sequencing, optical mapping, and PacBio and Nanopore long-read sequencing. These methods help identify causal genetic variants in cases that could not be diagnosed by traditional whole genome/exome sequencing approaches, and help map aberrant DNA modifications, such as methylation, in tissues from patients in comparison to controls. Some examples of computational tools developed by our lab include RepeatHMM, NextSV, LongSV, LinkedSV, NanoMod, and DeepMod.
Finally, we are developing data mining approaches for clinical phenotypic information in Electronic Health Records (EHR) to correlate genotype with phenotype and to better understand the phenotypic heterogeneity of inherited diseases. Some examples of computational tools that we developed include EHR-Phenolyzer, SparkText, Doc2HPO, and Phen2Gene, which use natural language processing on clinical notes to predict possible genetic syndromes and candidate genes.
Selected Publications
Doostparast Torshizi A, Armoskus C, Zhang H, Forrest MP, Zhang S, Souaiaia T, Evgrafov OV, Knowles JA, Duan J*, Wang K*: Deconvolution of Transcriptional Networks Identified TCF4 as a Master Regulator in Schizophrenia. Sci Adv 5(9): eaau4139, September 2019.
He MM, Li Q, Yan M, Cao H, Hu Y, He KY, Cao K, Li MM, Wang K: Variant Interpretation for Cancer (VIC): a computational tool for assessing clinical impacts of somatic variants. Genome Med 11(1): 53, August 2019 Notes: doi: 10.1186/s13073-019-0664-4.
Borgmann-Winter KE, Wang K, Bandyopadhyay S, Doostparast Torshizi A, Blair I, Hahn CY: The proteome and its dynamics: A missing piece for integrative multi-omics in schizophrenia. Schizophrenia Research in press(pii): S0920-9964(19)30303-2, August 2019 Notes: doi: 10.1016/j.schres.2019.07.025.
Liu, Fabricio K, Li Z, Ta C, Wang K*, Weng C*: Doc2Hpo: a web application for efficient and accurate HPO concept curation. Nucleic Acids Res 47(W1): W566-W570, July 2019 Notes: doi: 10.1093/nar/gkz386. *: Co-corresponding author.
Liu Q, Fang L, Yu G, Wang D, Xiao CL, Wang K: Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data. Nat Commun 10(1): 2449, June 2019 Notes: doi: 10.1038/s41467-019-10168-2.
Xie G, Dong C, Kong Y, Zhong JF, Li M, Wang K: GDP: Group lasso regularized Deep learning for cancer Prognosis from multi-omics and clinical features. Genes 10(3): 240, March 2019 Notes: doi: 10.3390/genes10030240.
Georgieva D, Liu Q, Wang K, Egli D: Detection of Base Analogs Incorporated During DNA Replication by Nanopore Sequencing. bioRxiv Feb 2019 Notes: doi: 10.1101/549220.
Fang L, Kao C, Gonzalez MV, Mafra FA, Pellegrino R, Li M, Wenzel S, Wimmer K, Hakonarson H, Wang K: LinkedSV for detection of mosaic structural variants from linked-read exome and genome sequencing data. Nat Commun 10(1): 5585, December 2019 Notes: doi: 10.1038/s41467-019-13397-7.
Zeng S, Zhang MY, Wang XJ, Hu ZM, Li JC, Li N, Wang JL, Liang F, Yang Q, Liu Q, Fang L, Hao JW, Shi FD, Ding XB, Teng JF, Yin XM, Jiang H, Liao WP, Liu JY, Wang K*, Xia K*, Tang BS*: Long-read sequencing identified intronic repeat expansions in SAMD12 from Chinese pedigrees affected with familial cortical myoclonic tremor with epilepsy. J Med Genet 56(4): 265-270, April 2019 Notes: doi: 10.1136/jmedgenet-2018-105484; *: co-corresponding author.
Dai Y, Li P, Wang Z, Liang F, Yang F, Fang L, Huang Y, Huang S, Zhou J, Wang D, Cui L, Wang K: Single-molecule optical mapping enables quantitative measurement of D4Z4 repeats in facioscapulohumeral muscular dystrophy (FSHD). J Med Genet pii: jmedgenet-2019-106078, September 2019 Notes: doi: 10.1136/jmedgenet-2019-106078.
Liu Q, Georgieva DC, Egli D, Wang K: NanoMod: a computational tool to detect DNA modifications using Nanopore long-read sequencing data. BMC Genomics 20(Suppl 1): 78, Feb 2019.
He Z, Liu L, Wang K, Ionita-Laza I: A semi-supervised approach for predicting cell-type specific functional consequences of non-coding variation using MPRAs. Nat Commun 9(1): 5199, Dec 2018.
Hoon Son J, Xie G, Yuan C, Ena L, Li Z, Goldstein A, Huang L, Wang L, Shen F, Liu H, Mehl K, Groopman EE, Marasa M, Kiryluk K, Gharavi AG, Chung WK, Hripcsak G, Friedman C, Weng C*, Wang K*: Deep phenotyping on electronic health records facilitates genetic diagnosis by clinical exomes. Am J Hum Genet 103(1): 58-73, July 2018 Notes: doi: 10.1016/j.ajhg.2018.05.010; *: co-corresponding author.
Xiao CL, Zhu S, He M, Chen D, Zhang Q, Chen Y, Yu GL, Liu J, Xie SQ, Luo F, Liang Z, Wang DP, Bo XC, Gu X-F*, Wang K*, Yan GR*: N6-methyladenine DNA modification in human genome. Mol Cell 71(2): 306-318, July 2018 Notes: doi: 10.1016/j.molcel.2018.06.015; *: Co-corresponding author.
Doostparast Torshizi A, Wang K: Next Generation Sequencing in Drug Development: Target Identification and Genetically Stratified Clinical Trials. Drug Discovery Today 23(10): 1776-1783, Oct 2018 Notes: doi: 10.1016/j.drudis.2018.05.015.