Faculty

Kai Wang, Ph.D.

Professor of Pathology and Laboratory Medicine
Department: Pathology and Laboratory Medicine

Contact information
3501 Civic Center Blvd, CTRB 6004
Children's Hospital of Philadelphia
Philadelphia, PA 19104
Office: 267-425-9573
Fax: 215-590-3660
Education:
B.S. (Biochemistry & Molecular Biology)
Peking University, 2000.
M.S. (Tumor Biology)
Mayo Clinic, 2002.
Ph.D. (Microbiology & Computational Biology)
University of Washington, 2005.

Description of Research Expertise

The research in our laboratory aims to develop novel genomics and bioinformatics methods to improve the diagnosis, treatment, and prognosis of rare diseases, and ultimately to facilitate the implementation of genomic medicine at scale. A detailed description of our research and rotation projects can be found on our lab website (https://wglab.org). In summary, our research can be divided into several areas.

First, we are developing analytical pipelines for whole genome and whole exome sequencing data, all the way from FASTQ/FAST5/POD5 files to biological insights. Some examples of computational tools include ANNOVAR, InterVar, CancerVar, Phenolyzer, Phen2Gene and PhenoSV. These approaches enhance the interpretation of sequencing data by uncovering functional content and providing clinically relevant insights.

Furthermore, we are developing genomic assays and methods to analyze long-read data, such as those generated from PacBio and Oxford Nanopore sequencing. These methods aid in identifying causal genetic variants in cases that elude diagnosis by traditional whole genome or exome sequencing and enable the detection of aberrant DNA and RNA methylation patterns. Some examples of computational tools include RepeatHMM, LinkedSV, ContextSV, NanoRepeat, LIQA, DeepMod and DeepMod2.

Finally, we are developing Artificial Intelligence (AI) and Machine Learning (ML) approaches to correlate genotype with phenotype, and to better understand the phenotypic heterogeneity of inherited diseases. We believe that multimodal AI holds the potential to transform our understanding of biology and medicine; what remains is to develop the right algorithms to fully harness its power. Some examples of computational tools include EHR-Phenolyzer, PhenoGPT, MutFormer and GestaltMML.

Selected Publications

Ahsan MU, Gouru A, Chan J, Zhou W, Wang K.: A signal processing and deep learning framework for methylation detection using Oxford Nanopore sequencing. Nat Commun 15: 1448, Feb 2024.

Gracia-Diaz C, Perdomo JE, Khan ME, Roule T, Disanza BL, Cajka GG, Lei S, Gagne AL, Maguire JA, Shalem O, Bhoj EJ, Ahrens-Nicklas RC, French DL, Goldberg EM, Wang K, Glessner JT, Akizu N.: KOLF2.1J iPSCs carry CNVs associated with neurodevelopmental disorders. Cell Stem Cell 31: 288-289, Mar 2024.

Jiang TT, Fang L, Wang K.: Deciphering "the language of nature": A transformer-based language model for deleterious mutations in proteins. Innovation (Camb) 4: 100487, Jul 2023.

Yang J, Liu C, Deng W, Wu D, Weng C, Zhou Y, Wang K.: Enhancing phenotype recognition in clinical notes using large language models: PhenoBCBERT and PhenoGPT. Patterns (N Y) 5: 100887, Dec 2023.

Xu Z, Li Q, Marchionni L, Wang K.: PhenoSV: interpretable phenotype-aware model for the prioritization of genes affected by structural variants. Nat Commun 14: 7805, Nov 2023.

Fang L, Monteys AM, Dürr A, Keiser M, Cheng C, Harapanahalli A, Gonzalez-Alegre P, Davidson BL, Wang K.: Haplotyping SNPs for allele-specific gene editing of the expanded huntingtin allele using long-read sequencing. HGG Adv 4: 100146, Sep 2022.

Fang L, Liu Q, Monteys AM, Gonzalez-Alegre P, Davidson BL, Wang K: DeepRepeat: direct quantification of short tandem repeats on signal data from nanopore sequencing. Genome Biol 23(1): 108, April 2022 Notes: DOI: 10.1186/s13059-022-02670-6.

Ahsan U, Liu Q, Fang L, Wang K: NanoCaller for accurate detection of SNPs and indels in difficult-to-map regions from long-read sequencing by haplotype-aware deep neural networks. Genome Biol 22(1): 261, Sep 2021 Notes: DOI: 10.1186/s13059-021-02472-2.

Havrilla JM, Liu C, Dong X, Weng C, Wang K: PhenCards: a data resource linking human phenotype information to biomedical knowledge. Genome Med 13(1): 91, May 2021 Notes: DOI: 10.1186/s13073-021-00909-8.

Hu Y, Fang L, Chen X, Zhong JF, Li M, Wang K: LIQA: Long-read Isoform Quantification and Analysis. Genome Biol 22: 182, June 2021 Notes: DOI: 10.1186/s13059-021-02399-8.

Georgieva D, Liu Q, Wang K*, Egli D*: Detection of Base Analogs Incorporated During DNA Replication by Nanopore Sequencing. Nucleic Acids Res 48(15): e88, September 2020 Notes: DOI: 10.1093/nar/gkaa517.

Doostparast Torshizi A, Armoskus C, Zhang H, Forrest MP, Zhang S, Souaiaia T, Evgrafov OV, Knowles JA, Duan J*, Wang K*: Deconvolution of Transcriptional Networks Identified TCF4 as a Master Regulator in Schizophrenia. Sci Adv 5(9): eaau4139, September 2019.

Fang L, Kao C, Gonzalez MV, Mafra FA, Pellegrino R, Li M, Wenzel S, Wimmer K, Hakonarson H, Wang K: LinkedSV for detection of mosaic structural variants from linked-read exome and genome sequencing data. Nat Commun 10(1): 5585, December 2019 Notes: DOI: 10.1038/s41467-019-13397-7.

Liu Q, Fang L, Yu G, Wang D, Xiao CL, Wang K: Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data. Nat Commun 10(1): 2449, June 2019 Notes: DOI: 10.1038/s41467-019-10168-2.

Last updated: 04/22/2025
The Trustees of the University of Pennsylvania