Abstract: In this talk, I will discuss two ongoing projects that apply different techniques to address the curse of dimensionality encountered in regression analysis of genomic data. In the first project, we introduce a low-rank tensor interaction model for detecting gene-gene interactions (G×G). While joint analysis (i.e., a multi-locus model with all main effects and G×G terms) is preferred over marginal analysis (i.e., single-locus tests with Bonferroni corrections), joint analysis is often impractical due to the ultra-large number of G×G terms. The proposed tensor regression imposes a certain structure on the space of G×G coefficients, approximates the space using a low-rank matrix to achieve a parsimonious parameterization, and identifies important main and G×G terms using LASSO. We discuss how the proposed tensor regression can be incorporated into the Screen-and-Clean approach to obtain p-values for the identified variables in a high-dimensional model. Numerical studies based on simulated and real data suggest that the proposed sparse low-rank regression increases the efficiency and accuracy of detecting gene-gene interactions compared with conventional methods. In the second project, we introduce a kernel machine regression method to study rare CNV effects. Due to the modest marginal effect sizes or rarity of CNVs, collapsing approaches could be important for studying how CNVs impact disease risk. While a plethora of powerful methods are available for SNP collapsing analysis, these methods cannot be directly applied to CNVs due to CNV-specific challenges: (1) the multi-faceted nature of CNV polymorphisms (e.g., CNVs vary in size, type, dosage, and details of gene disruption), and (2) etiological heterogeneity (e.g., heterogeneous effects of duplications/deletions). Existing burden tests tend to have suboptimal performance because they ignore heterogeneity and evaluate only the marginal effects of a CNV feature.
We introduce a collapsing method for CNVs based on the kernel machine framework; it collectively examines the effects of multiple CNV features and is robust to multiple types of heterogeneity. Multiple confounders can be adjusted for simultaneously. We demonstrate the robustness, validity, and utility of the proposed approaches using real data applications and simulations.
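To give a rough sense of why a low-rank structure on the G×G coefficients (the first project above) yields a parsimonious parameterization, the sketch below compares the number of free pairwise interaction coefficients with the number of parameters in a rank-R factorization. The symmetric form B = sum_r u_r u_r^T and the function names are assumptions made for illustration only; the abstract does not specify the exact structure used in the talk.

```python
def n_pairwise_interactions(p):
    """Number of distinct G×G terms among p loci in an unstructured model."""
    return p * (p - 1) // 2


def n_low_rank_parameters(p, rank):
    """Parameters in an assumed symmetric rank-`rank` factorization (rank vectors of length p)."""
    return p * rank


def interaction_matrix(factors):
    """Build B = sum_r u_r u_r^T from a list of factor vectors u_r (illustrative assumption)."""
    p = len(factors[0])
    B = [[0.0] * p for _ in range(p)]
    for u in factors:
        for j in range(p):
            for k in range(p):
                B[j][k] += u[j] * u[k]
    return B


p = 1000
full_count = n_pairwise_interactions(p)       # unstructured interaction coefficients
low_rank_count = n_low_rank_parameters(p, 2)  # rank-2 factorization
B_example = interaction_matrix([[1.0, 2.0, 0.0]])
```

Under this assumed structure, the parameter count grows linearly in the number of loci rather than quadratically, which is the kind of reduction that makes a joint model tractable.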
Stephanie Pugh, PhD, Senior Statistician, CCOP, Symptom Management, American College of Radiology
Gary Smith, MA, D.Phil
Professor of Animal Biology, Department of Animal Biology, University of Pennsylvania
Title: “Cats, Parasites and Schizophrenia: Determining Incidence In At-Risk Groups Using Case Control Study Data”
"On the Definition and Estimation of the Causal Effect of a Continuous Exposure: Theory and Applications"
Iván Díaz, PhD, Postdoctoral Fellow, Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
Abstract: The definition of a causal effect typically involves counterfactual variables resulting from interventions that modify the exposure of interest deterministically. A stochastic intervention generalizes this concept to define counterfactuals in which the post-intervention exposure is stochastic rather than deterministic. In this talk, I will present a new approach to causal effects based on stochastic interventions. I will focus on an application of this methodology to the definition and estimation of the causal effect of a shift of a continuous exposure. This parameter is of general interest because it generalizes the interpretation of the coefficient in a main-effects regression model to a nonparametric model. I will discuss three estimators of the causal effect: an inverse probability of treatment weighted (IPTW) estimator, an augmented IPTW estimator, and a targeted minimum loss-based estimator (TMLE). I will discuss the methods in the context of an application to the evaluation of the effect of physical activity on all-cause mortality in the elderly.
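As a minimal illustration of the IPTW idea for a shifted continuous exposure, the sketch below weights each observed outcome by the ratio g(A - delta | W) / g(A | W), where g is the conditional density of the exposure A given covariates W. The simulated data, the assumption that g is known exactly (normal with mean W and unit variance), and all function names are hypothetical choices for illustration; in practice g would have to be estimated, and this is not presented as the speaker's implementation.

```python
import math
import random


def normal_pdf(x, mean, sd=1.0):
    """Density of a normal distribution, used here as the assumed exposure model."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def iptw_shift_estimate(w, a, y, delta, density):
    """IPTW estimate of the mean outcome if every exposure were shifted by delta.

    density(x, w_i) must return the conditional density of exposure x given w_i.
    """
    total = 0.0
    for w_i, a_i, y_i in zip(w, a, y):
        weight = density(a_i - delta, w_i) / density(a_i, w_i)
        total += weight * y_i
    return total / len(y)


def simulate(n, seed=0):
    """Toy data: W ~ N(0,1), A | W ~ N(W,1), Y = A + W + noise (all assumed)."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    a = [w_i + rng.gauss(0.0, 1.0) for w_i in w]
    y = [a_i + w_i + rng.gauss(0.0, 1.0) for a_i, w_i in zip(a, w)]
    return w, a, y


w, a, y = simulate(50000)
delta = 0.5
estimate = iptw_shift_estimate(w, a, y, delta, lambda x, w_i: normal_pdf(x, w_i))
```

In this toy model the outcome is linear in the exposure with slope 1 and the mean outcome without a shift is 0, so the target value equals delta, and the estimate should fall close to it for large samples.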
"The Pharmacogenetics of Alcohol Dependence Treatment", Henry Kranzler, MD, Professor, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, 252 BRB (Jane Glick Classroom)
"Statistical Power Boosts for Randomized Clinical Trials"
Devan Mehrotra, PhD, Executive Director (Biostatistics) and Presidential Fellow, Merck Research Laboratories
Abstract: Increasing cost pressures in the pharmaceutical industry necessitate the development of novel approaches for making clinical development cheaper and faster while maintaining strong scientific rigor. From a statistical perspective, tangible progress can be made, for example, by developing and deploying innovative analyses of randomized clinical trials that are substantially more powerful than their traditional counterparts. Any increase in statistical efficiency readily translates to a reduction in the sample size (and hence time) required to address the objectives of the given clinical trial. In this talk, I will illustrate examples of resource-saving statistical innovation across all phases of clinical drug development, with a focus on more efficient use of baseline data in crossover trials, earlier detection of pharmacogenomics signals, and enhanced analyses of stratified time-to-event trials.
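The link between statistical efficiency and sample size can be illustrated with a standard textbook calculation. The sketch below uses the normal-approximation formula for a two-arm parallel trial with a continuous endpoint, and assumes that adjusting for a baseline covariate with correlation rho to the outcome reduces the residual variance by a factor of (1 - rho^2). The design, the numbers, and the function name are illustrative assumptions, not the specific methods discussed in the talk.

```python
import math
from statistics import NormalDist


def per_arm_sample_size(effect, sd, alpha=0.05, power=0.80, baseline_corr=0.0):
    """Approximate per-arm sample size for a two-sided, two-sample z-test.

    baseline_corr is the assumed correlation between a baseline covariate and
    the outcome; covariate adjustment scales the variance by (1 - corr^2).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    residual_var = sd ** 2 * (1.0 - baseline_corr ** 2)
    n = 2.0 * residual_var * (z_alpha + z_beta) ** 2 / effect ** 2
    return math.ceil(n)


n_unadjusted = per_arm_sample_size(effect=0.5, sd=1.0)
n_adjusted = per_arm_sample_size(effect=0.5, sd=1.0, baseline_corr=0.5)
```

With these assumed inputs, the adjusted analysis needs about three quarters of the unadjusted per-arm sample size, which shows how a more efficient analysis translates directly into a smaller trial.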