
Martin Hemberg

Martin Hemberg PhD, Associate Group Leader at the Gurdon Institute and Career Development Fellow Group Leader at the Wellcome Sanger Institute.

Computational analyses of large genomic datasets

What can sequencing data tell us about disease? To create the different cell types in an organism, different genes are expressed at different times from the whole genome as transcripts of RNA, which will include both protein-coding and noncoding species. Understanding how, why, when and where genes are expressed is crucial for understanding not just development but also many diseases.

High-throughput sequencing of RNA from different tissues can now provide insights into gene expression and related properties, but the experimental datasets are large, high-dimensional and noisy. Computational methods are required to extract maximum information from such data.

Our group uses computational analysis to develop quantitative models of gene expression and gene regulation. In particular, we are exploring single-cell RNA sequencing, which can reveal insights that are inaccessible through traditional bulk experiments; for example, it can be used to estimate the number of differentiated cell types in the body.

Another strand of research aims to further our understanding of noncoding DNA by providing better models of regulatory elements and characterising noncoding RNA.

Our ongoing research projects include:

  • Inference of gene regulatory networks from single-cell RNA-seq data.
  • Characterisation of the transcriptome of individual nematodes (with the Miska lab). 
  • Characterisation of the heterogeneity of liver organoids (with the Huch lab).
  • Identification and characterisation of non-canonical secondary structures in DNA.
  • Virtual Reality technology for visualising genomic data (collaborating with HammerheadVR).

Selected publications:

• Nguyen TA et al. (2016) High-throughput functional comparison of promoter and enhancer activities. Genome Res. Jun 16. pii: gr.204834.116. [Epub ahead of print]

• Delmans M and Hemberg M (2016) Discrete distributional differential expression (D3E) - a tool for gene expression analysis of single-cell RNA-seq data. BMC Bioinformatics. 17:110.

• Kiselev VY et al. (2016) SC3 - consensus clustering of single-cell RNA-Seq data. bioRxiv pre-print published online 14 April 2016.

• Prabakaran S et al. (2014) Quantitative profiling of peptides from RNAs classified as noncoding. Nature Communications 5: 5429.

• Kim TK et al. (2010) Widespread transcription at neuronal activity-regulated enhancers. Nature 465 (7295): 182–187.




Jimmy (Tsz Hang) Lee • Nicholas Keone Lee

Based at the Wellcome Sanger Institute: Tallulah Andrews • Ilias Georgakopoulos-Soares • Louis-Francois Handfield • Guillermo Parada Gonzalez • Cristian Riccio • Xiaojuan Shen