Tuesday, March 26, 2019

Shining a light on human acrocentric p-arms


The GRC is excited to announce that representations for the p-arms of the human acrocentric chromosomes can now be found in the GRCh38.p13 patch update of the reference genome, thanks to work done in Brian McStay's lab. These sequences are included on the following scaffolds: ML143366.1, ML143367.1, ML143372.1, ML143377.1, and ML143380.1.
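For readers who want to pull these scaffolds programmatically, the following is a minimal sketch, assuming the sequences are retrieved as FASTA from the NCBI nucleotide database through the E-utilities `efetch` endpoint. The helper name `fasta_url` is hypothetical, and the code only builds the request URLs; it does not perform any download.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (assumed retrieval route; not named in the post).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Scaffold accessions listed in the announcement above.
SCAFFOLDS = [
    "ML143366.1",
    "ML143367.1",
    "ML143372.1",
    "ML143377.1",
    "ML143380.1",
]


def fasta_url(accession: str) -> str:
    """Build an efetch URL requesting one nucleotide record as plain-text FASTA."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # accession.version of the scaffold
        "rettype": "fasta",   # FASTA record
        "retmode": "text",    # plain text rather than XML
    }
    return f"{EFETCH}?{urlencode(params)}"


# Map each scaffold accession to its download URL.
urls = {accession: fasta_url(accession) for accession in SCAFFOLDS}
```

Each URL in `urls` can then be passed to a download tool of choice, keeping within NCBI's usage guidelines for request rates.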

The p-arms of the human acrocentric chromosomes HSA13-15, 21 and 22 each bear ribosomal gene arrays (Figure 1) termed nucleolar organiser regions (NORs). These are the most transcriptionally active regions of the genome and direct formation of nucleoli, the largest structures in the nuclei of all human cells. Research on these critical genomic regions is hampered by the fact that acrocentric p-arms are not included in human genome drafts. They are both internally highly repetitive and share a strikingly similar sequence content, making them recalcitrant to standard sequencing approaches. Despite these issues, Brian McStay's lab previously described a collection of sequenced cosmid and BAC clones that allowed them to build a reasonable consensus for sequences both immediately proximal and distal to NORs (Floutsakou et al. 2013. Genome Res 23:2003-12). Proximal sequences are almost entirely segmentally duplicated, similar to regions bordering centromeres. In contrast, the distal sequence is predominantly unique to the acrocentric p-arms. The interphase localisation, open chromatin structure and transcriptionally active state of these distal sequences point to a role in nucleolar biology and prioritise their inclusion in a future genome draft (for discussion see McStay 2016. Genes Dev 30:1598-610).

The McStay lab subsequently developed a workflow that has enabled them to determine the NOR distal sequence, the Distal Junction (DJ), from all five acrocentric chromosomes and from an additional two versions of HSA21, ~3 Mb in all. A panel of mono-chromosomal somatic cell hybrids, mouse A9 cells each containing an individual human chromosome, allowed them to sequence one chromosome at a time. Sequencing was performed by combining sequence capture with PacBio SMRT sequencing. Pre-capture libraries (typically in the range of 4-6 kb) were prepared from each hybrid line. Capture was performed using oligonucleotide libraries designed from their original consensus. Circular consensus sequencing (CCS) of post-capture libraries generated so-called reads of insert (ROIs), each with high sequence accuracy. This allowed the McStay group to assemble sequence contigs from the NOR distal region of each chromosome, regardless of the presence of repetitive sequences such as satellite DNA.
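To illustrate why repeated passes over the same insert yield a more accurate read, here is a toy sketch of a per-position majority vote. It is not the PacBio CCS algorithm, which aligns the passes and uses a probabilistic model; this version assumes the passes are already aligned and of equal length, and the function name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(passes: list[str]) -> str:
    """Return the most common base at each position across pre-aligned passes.

    Toy model only: assumes equal-length, already aligned passes with
    substitution errors and no insertions or deletions.
    """
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("all passes must have the same length")

    consensus = []
    # zip(*passes) walks the passes column by column.
    for column in zip(*passes):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Three noisy passes over the same (made-up) insert; each has one error
# at a different position, so the vote recovers the underlying sequence.
example_passes = ["ACGTAC", "ACGTTC", "ACCTAC"]
example_consensus = majority_consensus(example_passes)
```

Because independent errors rarely fall at the same position in most passes, the consensus becomes more reliable as the number of passes grows.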

Their analysis of these sequences confirms sequence and presumably functional conservation between the acrocentrics. It also provides evidence for non-homologous exchanges between them. It's anticipated that extension of sequence contigs towards the telomeres will uncover increased structural variation between the acrocentric chromosomes.

Figure 1. FISH experiment showing the relative locations of the rDNA array and distal junctions on the p-arms of the human acrocentric chromosomes.