Murray Lab

Embryo data & corresponding Cell Lineage tree

Ensuring robust and precise development


The goal of the Murray laboratory is to understand how genomes control animal development. Animals, including humans, consist of many highly specialized cell types, which must be generated in the correct number and at the right time and place for the organism to function correctly. Most of our understanding of this process comes from studies that focus on individual cell types or tissues, which limits what we know about how regulators and mechanisms apply across all cells in an organism. The Murray laboratory develops and uses whole-organism live-cell imaging and genomics methods, combined with classical genetics, to study gene regulation across the entire embryo. We use the nematode worm Caenorhabditis elegans as a model for many of our studies because it shares most of its cell types and regulatory mechanisms with humans but is much easier to study.



A lineage-resolved molecular atlas of C. elegans embryogenesis at single-cell resolution.

Read this paper

Read an article about this paper

3D projection of clustered single-cell RNA expression data from C. elegans embryos


We profiled the transcriptomes of 86,024 single cells from C. elegans embryos at developmental stages ranging from gastrulation to terminal cell differentiation. Using computational methods, gene expression patterns from the literature, and gene expression data obtained from three-dimensional (3D) movies of fluorescent reporter lines, we mapped each single-cell transcriptome to its corresponding position in the known C. elegans cell lineage tree. In total, we identified 502 distinct terminal and preterminal cell types, which correspond to 1068 individual branches of the lineage tree. We computed a transcriptional profile for each detected cell type and determined the gene expression differences between mother and daughter cells, and between sister cells, for >200 cell division events in the lineage.

Analyzing these data, we find that:

1) A cell’s lineage history and its transcriptome are transiently correlated. This correlation increases from middle to late gastrulation, then falls substantially as cells adopt their terminal fates.

2) Genes that distinguish sister cells are often first coexpressed in the parent and then selectively retained in one daughter but not the other. This phenomenon, known as “multilineage priming,” is notably prevalent throughout the C. elegans lineage.

3) Most distinct lineages that produce the same anatomical cell type converge to a homogeneous transcriptomic state, with little or no residual signature of their lineage identity.

4) In many cases, purely computational reconstruction of developmental trajectories from the single-cell transcriptomic data does not accurately reproduce the known cell lineage. Marker genes known to be expressed in specific lineages were critical for correct annotation. This is particularly evident for lineages in which gene expression changes rapidly.
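As a rough illustration of finding 1, the relationship between lineage history and transcriptome can be sketched as a correlation between pairwise lineage distance and pairwise expression distance. This is a minimal sketch under stated assumptions, not the analysis used in the paper: the function names, the Euclidean expression distance, and the toy profiles are all hypothetical, and the lineage distance assumes both cells descend from the same founder, with each additional letter in the cell name standing for one division.

```python
from typing import Mapping

import numpy as np


def lineage_distance(a: str, b: str) -> int:
    """Count the divisions separating two cells through their last common ancestor.

    Assumption: both names share a founder prefix (e.g. "AB") and every
    subsequent character encodes exactly one division.
    """
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return (len(a) - shared) + (len(b) - shared)


def lineage_expression_correlation(profiles: Mapping[str, np.ndarray]) -> float:
    """Pearson correlation between lineage distance and expression distance.

    `profiles` maps a cell name to its (hypothetical) expression vector.
    Needs at least three cells with some variation in both distances.
    """
    names = sorted(profiles)
    lineage_d, expression_d = [], []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            lineage_d.append(lineage_distance(a, b))
            # Euclidean distance is an illustrative choice of metric.
            expression_d.append(float(np.linalg.norm(profiles[a] - profiles[b])))
    return float(np.corrcoef(lineage_d, expression_d)[0, 1])


# Toy data (made up): two sister cells and one distant cousin.
toy_profiles = {
    "ABala": np.array([0.0, 0.0]),
    "ABalp": np.array([1.0, 0.0]),
    "ABpra": np.array([5.0, 5.0]),
}
toy_correlation = lineage_expression_correlation(toy_profiles)
```

Computing this value separately for cells grouped by developmental stage would be one way to see a correlation that rises and then falls, although a real analysis would need to handle founder cells whose names do not follow the simple prefix rule.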


Our dataset defines the succession of gene expression changes associated with almost every cell division in an animal’s embryonic cell lineage. It provides an extensive resource that will guide future investigations of gene regulation and cell fate decisions in C. elegans. It can also serve as a benchmark dataset for rigorous evaluation of computational methods that reconstruct cell lineages from scRNA-seq data.



Want to join us?

Contact John to apply for a postdoctoral position. We are affiliated with several graduate programs at Penn including:


Highlights of Philly: