Thursday, November 27, 2014

Optical Mapping data in the GRC track hub

The GRC track hub now includes Optical Mapping analysis information.

What is Optical Mapping?
Optical Mapping (OM) is a method for producing ordered restriction maps (rMaps) from single DNA molecules. These rMaps are assembled into consensus maps, which can be aligned against the reference assembly, taking into account the positions of restriction sites and the lengths of fragments. OM aids the scaffolding of genomic sequence and the identification of errors in genome assemblies, and it is also very helpful for confirming assembled contigs and sizing gaps.
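To illustrate the idea of aligning a consensus map by fragment lengths, here is a deliberately simplified Python sketch. It is not the method used to produce the alignments in the track hub: real OM aligners use scoring models that also allow for missing or extra cut-sites. The function name, the 10% sizing tolerance and the fragment sizes are all assumptions made for illustration.

```python
def find_map_placements(consensus, reference, tolerance=0.1):
    """Return the indices of reference fragments at which the consensus map fits.

    consensus and reference are ordered lists of fragment lengths in bp.
    A placement is accepted only if every consensus fragment is within
    `tolerance` (as a fraction of the reference fragment) of its partner.
    This toy model assumes no missing or extra cut-sites.
    """
    if not consensus:
        return []
    placements = []
    n = len(consensus)
    # Slide the consensus map along the reference, one fragment at a time.
    for i in range(len(reference) - n + 1):
        window = reference[i:i + n]
        if all(abs(c - r) <= tolerance * r for c, r in zip(consensus, window)):
            placements.append(i)
    return placements


# Hypothetical fragment sizes, invented for this example.
reference_fragments = [12000, 5300, 20100, 7400, 15000, 9800]
consensus_fragments = [19500, 7600, 14800]
print(find_map_placements(consensus_fragments, reference_fragments))  # prints [2]
```

The consensus map is placed at the third reference fragment (index 2), because only there do all three fragment lengths agree within the tolerance.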

What OM data is available?
OM data is currently available for human (Teague et al., 2010) and mouse (Church et al., 2009). We would like to thank Steve Goldstein and David Schwartz for providing the alignments to the respective reference assemblies.

What is displayed in the GRC track hub?
The OM data is divided into several tracks in the GRC track hub. These tracks are of three types:
  • OM alignment tracks show the alignments of consensus maps to the reference genome, based on the comparison of restriction patterns. Each track of this type corresponds to an analysis of OM data from a single cell line.
  • OM deletion tracks present the locations of additional restriction fragments that have no corresponding fragment in the reference assembly. Their positions are inferred from the rest of the alignment of the respective consensus map. Again, each track of this type corresponds to an analysis of OM data from a single cell line.
  • Each assembly also has a single OM reference track, which presents the set of OM fragments that would be expected based on the reference sequence, produced via an in silico restriction digest.
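An in silico restriction digest of the kind mentioned for the OM reference track can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used to build the track: the function name is hypothetical, and the EcoRI recognition site (GAATTC, cut after the first G) is used only as a familiar example, since the enzyme used for the OM data is not stated here.

```python
def in_silico_digest(sequence, site, cut_offset=0):
    """Digest a linear sequence and return (cut_positions, fragment_lengths).

    site is the recognition sequence; cut_offset is the number of bases
    from the start of the site to the cut. Only the given strand is
    searched, which is sufficient for palindromic sites such as GAATTC.
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cut = start + cut_offset
        # Ignore cuts at the very ends, which would give empty fragments.
        if 0 < cut < len(sequence):
            cuts.append(cut)
        start = sequence.find(site, start + 1)
    # Fragment lengths are the distances between consecutive boundaries.
    boundaries = [0] + cuts + [len(sequence)]
    fragments = [b - a for a, b in zip(boundaries, boundaries[1:])]
    return cuts, fragments


# Toy sequence containing two GAATTC sites.
cuts, fragments = in_silico_digest("AAGAATTCGGGGAATTCTT", "GAATTC", cut_offset=1)
print(cuts)       # prints [3, 12]
print(fragments)  # prints [3, 9, 7]
```

The cut positions correspond to the vertical lines drawn in a reference track, and the fragment lengths are what a consensus map would be compared against.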

How is this information visualised?
OM analysis data is displayed slightly differently for each of the three track types described above (alignments, deletions, and predicted fragments based on the reference).
  • The OM alignment tracks present each contig as a horizontal line, with the restriction cut-sites that divide fragments displayed as vertical lines along that contig. Where the analysis leaves a space between the placements of successive restriction fragments, this is represented as a thicker vertical bar spanning the gap between the fragments.
  • The OM deletion tracks use a single vertical bar to show the location of each fragment or group of fragments with no corresponding fragment in the reference assembly. The size of the fragment is not represented by the glyph in the browser, but is shown as one of its data fields.
  • The OM reference track displays the cut-sites between expected restriction digest fragments as vertical lines.

Display examples
In the Ensembl genome browser (from version 78 onwards), the OM reference track is at the top, with the OM deletion track "OM gap 15510" below it, followed by three OM alignment tracks based on different cell lines. (Note that OM deletion tracks exist for all cell lines that have alignment tracks, but only one OM deletion track has data at this location.)

Here is how it appears in the UCSC genome browser. The tracks are in the same order as in the Ensembl example above: the OM reference track is at the top, with the OM deletion track "OM gap 15510" below it, followed by three OM alignment tracks based on different cell lines.