CYP2D6 (Gene ID: 1565) is a gene associated with the metabolism of ~25% of clinically prescribed drugs, including antidepressants, neuroleptics and opioids. CYP2D6 is located on chromosome 22 (22q13.1), near two cytochrome P450 pseudogenes, CYP2D7 and CYP2D8P. Because of its functional importance, the GRC updated the chromosomal representation of CYP2D6 in GRCh38 to the sequence-corrected clinical standard (CYP2D6*1A). The version of CYP2D6 found in the prior assembly, GRCh37, was retained in GRCh38 as an alternate locus scaffold (KI270928.1).
The Genome Reference Consortium, in conjunction with the Pharmacogenomics Research Network, has also sought to identify and provide reference assembly representation for structural variation at the CYP2D6 locus. Much of this work was done by examining the end-sequence alignments of different fosmid libraries to the reference (Kidd et al.). As the reference assembly was known to represent CYP2D6, CYP2D7 and CYP2D8P each in single copy, we could ascertain potential duplication and deletion alleles of these genes by identification of discordant fosmid end-sequences (Figure 1).
|Figure 1. Alignment of fosmid ends from the ABC12 library to GRCh38 chromosome 22 in the vicinity of CYP2D6. Lines connect ends belonging to the same clone. Concordant placements (clone length within 3 standard deviations of the library's average insert size, with inward-facing ends) are shown in blue; clones with discordant placements are shown in red.|
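The concordance rule used for the fosmid end placements (span within 3 standard deviations of the library insert average, with inward-facing ends) can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline actually used for the analysis: the `EndPair` record, the `classify_pair` function, the discordance labels, and the library statistics in the usage example (mean 40,000 bp, standard deviation 2,500 bp) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EndPair:
    """Placement of both end sequences of one clone on the reference (hypothetical record)."""
    clone: str
    left_start: int    # leftmost reference coordinate of the left-hand end
    left_strand: str   # '+' or '-'
    right_end: int     # rightmost reference coordinate of the right-hand end
    right_strand: str  # '+' or '-'


def classify_pair(pair: EndPair, mean_insert: float, sd_insert: float,
                  n_sd: float = 3.0) -> str:
    """Label a clone placement as concordant or as one type of discordant.

    Concordant: ends face inward (left end on '+', right end on '-') and the
    implied span is within n_sd standard deviations of the library mean.
    """
    span = pair.right_end - pair.left_start
    inward = pair.left_strand == "+" and pair.right_strand == "-"
    length_ok = abs(span - mean_insert) <= n_sd * sd_insert

    if not inward:
        return "discordant:orientation"
    if length_ok:
        return "concordant"
    # A span that is too long is consistent with a deletion in the clone's
    # source haplotype relative to the reference; a span that is too short
    # is consistent with extra sequence in that haplotype.
    return "discordant:too_long" if span > mean_insert else "discordant:too_short"


if __name__ == "__main__":
    # Hypothetical library statistics and placements, for illustration only.
    MEAN, SD = 40_000, 2_500
    pairs = [
        EndPair("clone_A", 1_000, "+", 41_000, "-"),
        EndPair("clone_B", 1_000, "+", 60_000, "-"),
        EndPair("clone_C", 1_000, "+", 20_000, "-"),
        EndPair("clone_D", 1_000, "-", 41_000, "+"),
    ]
    for p in pairs:
        print(p.clone, classify_pair(p, MEAN, SD))
```

Clusters of clones sharing the same discordant label over the same interval, rather than any single clone, are what would point to a candidate structural variant.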
As of the GRCh38.p7 assembly release, there are 3 alt loci and 3 novel patch scaffolds that provide representation for significant structural variation in the CYP2D6 region (Table 1). Figure 2 shows the alignment of these scaffolds to the reference chromosome, highlighting the diversity in the variant representations. An example of a CYP2D6 triplication haplotype is shown in Figure 3.
|Table 1: Scaffolds providing alternate sequence representations for the CYP2D6 region, as of GRCh38.p7.|
These additional representations of the locus are included in the reference assembly to help in the evaluation of CYP2D6 variant alleles from other samples. When reads are aligned with an alt-aware aligner, such as bwa-mem or SRPRISM, the variant scaffolds can be included in the target assembly. Doing so should enable identification of the haplotype that most closely matches the query sample.
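As a rough illustration of what "closest match" could mean in practice, the sketch below tallies, for each scaffold, the reads whose single best-scoring alignment falls on that scaffold. It is a simplified assumption about one possible downstream analysis, not a description of how bwa-mem or SRPRISM report alt alignments. The `rank_scaffolds` function and the `(read_id, scaffold, score)` tuple format are hypothetical, and the scores in the usage example are invented.

```python
from collections import Counter, defaultdict


def rank_scaffolds(alignments):
    """Rank scaffolds by the number of reads that align best to each one.

    alignments: iterable of (read_id, scaffold, score) tuples, where a higher
    score means a better alignment (hypothetical input format).
    Reads whose best score is shared by several scaffolds do not discriminate
    between haplotypes and are skipped.
    """
    # Keep the best score seen for each (read, scaffold) combination.
    per_read = defaultdict(dict)
    for read_id, scaffold, score in alignments:
        previous = per_read[read_id].get(scaffold)
        if previous is None or score > previous:
            per_read[read_id][scaffold] = score

    tally = Counter()
    for scores in per_read.values():
        top = max(scores.values())
        winners = [name for name, value in scores.items() if value == top]
        if len(winners) == 1:  # count only uniquely best placements
            tally[winners[0]] += 1
    return tally.most_common()


if __name__ == "__main__":
    # Invented scores; the scaffold accessions are those named in the text.
    example = [
        ("r1", "chr22", 90), ("r1", "NW_014040931.1", 100),
        ("r2", "chr22", 100), ("r2", "NW_014040931.1", 100),  # tie, skipped
        ("r3", "NW_014040931.1", 95),
        ("r4", "KI270928.1", 80), ("r4", "chr22", 70),
    ]
    for scaffold, count in rank_scaffolds(example):
        print(scaffold, count)
```

A real analysis would likely also weigh mapping quality, read depth, and coverage across the scaffold, so this ranking is at most a starting point.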
|Figure 3. Graphical view of patch scaffold NW_014040931.1, with the alignment of CYP2D6, revealing a haplotype containing a triplication of this locus.|