Preprints, Working Papers, ...

A statistical approach for inferring the three-dimensional structure of the genome

Abstract : Recent technological advances allow the measurement, in a single Hi-C experiment, of the frequencies of physical contacts among pairs of genomic loci at a genome-wide scale. The next challenge is to infer, from the resulting DNA-DNA contact maps, accurate three-dimensional models of how chromosomes fold and fit into the nucleus. Many existing inference methods rely upon multidimensional scaling (MDS), in which the pairwise distances of the inferred model are optimized to resemble pairwise distances derived directly from the contact counts. These approaches, however, often optimize a heuristic objective function and require strong assumptions about the biophysics of DNA to transform interaction frequencies into spatial distances, thereby leading to incorrect structure reconstruction. We propose a novel approach to infer a consensus three-dimensional structure of a genome from Hi-C data. The method incorporates a statistical model of the contact counts, assuming that the counts between two loci follow a Poisson distribution whose intensity decreases with the physical distance between the loci. The method can automatically adjust the transfer function relating the spatial distance to the Poisson intensity and infer a genome structure that best explains the observed data. We compare two variants of our Poisson method, with or without optimization of the transfer function, to four different MDS-based algorithms (two metric MDS methods using different stress functions, a nonmetric version of MDS, and ChromSDE, a recently described, advanced MDS method) on a wide range of simulated datasets. We demonstrate that the Poisson models reconstruct better structures than all MDS-based methods, particularly at low coverage and high resolution, and we highlight the importance of optimizing the transfer function.
On publicly available Hi-C data from mouse embryonic stem cells, we show that the Poisson methods lead to more reproducible structures than MDS-based methods when we use data generated using different restriction enzymes, and when we reconstruct structures at different resolutions.
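
The Poisson model described in the abstract can be illustrated with a minimal sketch. The abstract states only that the Poisson intensity decreases with the distance between loci; the power-law transfer function `beta * d ** alpha` (with `alpha < 0`), the function name `poisson_neg_log_likelihood`, and the toy coordinates and counts below are assumptions made for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def poisson_neg_log_likelihood(coords, counts, alpha=-3.0, beta=1.0):
    """Negative Poisson log-likelihood of contact counts given 3D coordinates.

    coords: list of (x, y, z) tuples, one per locus (all positions distinct).
    counts: dict mapping (i, j) with i < j to an observed contact count.
    alpha, beta: parameters of the ASSUMED transfer function
        intensity = beta * distance ** alpha, with alpha < 0.
    """
    nll = 0.0
    for i, j in combinations(range(len(coords)), 2):
        c = counts.get((i, j), 0)
        d = math.dist(coords[i], coords[j])  # Euclidean distance
        lam = beta * d ** alpha              # intensity decreases with distance
        # -log P(c | lam) = lam - c * log(lam) + log(c!)
        nll += lam - c * math.log(lam) + math.lgamma(c + 1)
    return nll


# Toy example (hypothetical data): three loci on a line.
counts = {(0, 1): 8, (1, 2): 8, (0, 2): 1}
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
shuffled = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

nll_chain = poisson_neg_log_likelihood(chain, counts, alpha=-3.0, beta=8.0)
nll_shuffled = poisson_neg_log_likelihood(shuffled, counts, alpha=-3.0, beta=8.0)
```

In this toy setting the `chain` layout places frequently contacting loci close together, so it scores a lower negative log-likelihood than `shuffled`. A structure-inference method of this kind would minimize such an objective over the coordinates and, in the variant that adjusts the transfer function, over its parameters as well.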

Cited literature [26 references]

https://hal-mines-paristech.archives-ouvertes.fr/hal-00937182
Contributor : Jean-Philippe Vert
Submitted on : Tuesday, January 28, 2014 - 12:44:14 AM
Last modification on : Tuesday, November 17, 2020 - 10:42:04 AM
Long-term archiving on : Sunday, April 9, 2017 - 12:53:18 AM

File

techreport.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00937182, version 1

Citation

Nelle Varoquaux, Ferhat Ay, William Noble, Jean-Philippe Vert. A statistical approach for inferring the three-dimensional structure of the genome. 2014. ⟨hal-00937182⟩
