Friday, February 2, 2024

GRCr8: A new rat reference assembly is released!

GRCr8 (GCA_036323735.1), the latest version of the rat reference genome assembly, is now available. GRCr8 is an evolution of mRatBN7.2 (GCA_015227675.2), the Vertebrate Genomes Project-generated rat assembly that was the first reference for this species to be adopted by the GRC for stewardship. mRatBN7.2 was an assembly of a Brown Norway (BN) male rat from the same colony at the Medical College of Wisconsin that supplied the female rat used in the 2004 RGSC_v3.4 assembly (AABR00000000.3/GCF_000001895.1). While mRatBN7.2 was a substantial improvement over prior versions, advances in sequencing technology and in assembly and curation methods since its release in 2020 have, for the first time, led the GRC to release a new de novo assembly as a reference update instead of curating issues in the prior version.

GRCr8 was generated by Dr. Peter Doris (University of Texas Health Science Center at Houston) with colleagues Theodore Kalbfleisch (University of Kentucky) and Melissa Smith (University of Louisville) in the NHGRI-funded “Inbred Rat Genomes Project”. The assembly is based on PacBio HiFi sequences from a BN/NHsdMcwi male rat. Gaps were filled using contigs from the PacBio CLR reads produced for mRatBN7.2. Additional short-read genomic sequences from a BN/NHsdMcwi rat in the Hybrid Rat Diversity Program at the Medical College of Wisconsin were used for assembly polishing. In addition to yielding a consensus quality score (QV) of 59.5, GRCr8 addresses structural limitations of mRatBN7.2. The genome size has increased from 2.63 Gb to 2.81 Gb, largely because of the incorporation of genomic regions that show structural expansion. For example, chrY has increased from 18 Mb in mRatBN7.2 to 60 Mb in GRCr8. Accompanying multi-tissue single-molecule transcript information (PacBio IsoSeq) is available for this assembly (BioProject: PRJNA1027884). These data extend the scope of rat transcript diversity and will inform gene expression in newly incorporated regions of the genome.
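
To put these figures in perspective, the short sketch below converts the reported QV into an approximate per-base error rate and computes the size changes quoted above. It assumes the QV is Phred-scaled, as is conventional for assembly consensus quality; the announcement does not state how it was calculated. The function names are illustrative only and are not part of any GRC tooling.

```python
def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability.

    Assumption: the QV follows the standard Phred definition,
    QV = -10 * log10(error probability).
    """
    return 10 ** (-qv / 10)


def relative_increase(old: float, new: float) -> float:
    """Fractional increase going from old to new (0.10 means +10%)."""
    return (new - old) / old


# Figures quoted in the announcement.
error_rate = qv_to_error_rate(59.5)
genome_growth = relative_increase(2.63, 2.81)  # assembly size in Gb
chr_y_fold = 60 / 18                           # chrY size in Mb

print(f"Per-base error rate: {error_rate:.2e}")
print(f"Roughly one error every {1 / error_rate:,.0f} bases")
print(f"Genome size increase: {genome_growth:.1%}")
print(f"chrY fold increase: {chr_y_fold:.1f}x")
```

Under that assumption, a QV of 59.5 corresponds to roughly one consensus error per 890,000 bases, the assembly is about 7% larger than mRatBN7.2, and chrY is a little more than three times its previous size.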

The GRCr8 assembly has been submitted to the INSDC, making it available through GenBank, ENA and DDBJ. It will subsequently be annotated by groups such as RefSeq and Ensembl, after which it will be available on genome browsers at various resources, including the Rat Genome Database, NCBI, Ensembl, and UCSC.