A deep intronic recurrent CHEK2 variant c.1009-118_1009-87delinsC affects pre-mRNA splicing and contributes to hereditary breast cancer predisposition

Germline CHEK2 pathogenic variants confer an increased risk of female breast cancer (FBC). Here we describe a recurrent germline intronic variant c.1009-118_1009-87delinsC, which showed a splice acceptor shift in RNA analysis, introducing a premature stop codon (p.Tyr337PhefsTer37). The variant was found in 21/10,204 (0.21%) Czech FBC patients compared to 1/3250 (0.03%) controls (p = 0.04) and in 4/3639 (0.11%) FBC patients from an independent German dataset. In addition, we found this variant in 5/2966 (0.17%) Czech (but none of the 443 German) ovarian cancer patients, three of whom developed early-onset tumors. Based on these observations, we classified this variant as likely pathogenic.

Importantly, CHEK2 is the second most frequently altered BC predisposition gene in female BC patients of European ancestry, surpassed by BRCA2 and followed by BRCA1 [2,3]. The frequency of CHEK2 germline pathogenic variants (GPV), comprising truncations, splicing alterations, and large copy number variations, is approximately 1.1-1.3% in unselected female BC patients from Europe and the USA [2,3]. Additionally, 0.5% of BC patients may be carriers of rare missense CHEK2 GPV [13].
In this study, we identified a previously unreported, frequent deep intronic CHEK2 GPV, characterized its effect at the RNA level, and provided evidence for its contribution to increased BC risk.

Identification of the c.1009-118_1009-87delinsC CHEK2 variant
We re-analyzed anonymized next-generation sequencing (NGS) data from 10,204 female BC and 2966 ovarian cancer (OC) Czech patients who were clinically tested using the CZECANCA panel (including CHEK2) within the Czech consortium of diagnostic laboratories (www.czecanca.cz) [14-17]. We specifically searched for deep intronic germline CHEK2 (NM_007194.4) variants located outside the canonical intronic splice sites. The impact of the identified CHEK2 variant on pre-mRNA processing was analyzed by CZECANCA panel-based total RNA sequencing as described previously [18,19]. RNA was extracted from peripheral blood leukocytes cultured with or without inhibition of nonsense-mediated decay (NMD) by cycloheximide (final concentration 200 μg/mL) for 4 h. For variant burden analysis (two-sided Fisher's exact test), we used data from 3250 unselected Czech female population-matched controls (PMC) analyzed by the same NGS approach.
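The burden analysis can be illustrated with a minimal sketch built on the carrier counts given in the abstract (21/10,204 patients versus 1/3250 controls). The function names are hypothetical, and the two-sided p-value uses one common convention (summing all tables that are no more probable than the observed one); the statistical software actually used in the study is not stated, so its result may differ slightly.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample (cross-product) odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution. The p-value sums the probabilities of every table that
    is no more probable than the observed one (an assumed convention).
    """
    row1, row2, col1 = a + b, c + d, a + c
    k_min, k_max = max(0, col1 - row2), min(col1, row1)
    # Unnormalized hypergeometric weights, kept as exact integers.
    weights = {k: comb(row1, k) * comb(row2, col1 - k)
               for k in range(k_min, k_max + 1)}
    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / sum(weights.values())


# Counts taken from the abstract: carriers / non-carriers.
cases = (21, 10204 - 21)      # Czech female BC patients
controls = (1, 3250 - 1)      # Czech population-matched controls

or_estimate = odds_ratio(*cases, *controls)
p_value = fisher_exact_two_sided(*cases, *controls)
print(f"OR = {or_estimate:.1f}, two-sided p = {p_value:.3f}")
```

The odds ratio from these counts rounds to 6.7, the value quoted in the Discussion. With a single carrier among the controls, the estimate is very sensitive to that one observation.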
Variant frequency was independently assessed in data from 3639 German BC patients and 443 German OC patients (described previously [20]) using GATK HaplotypeCaller for variant calling. Because the variant lies outside (but close to) the corresponding sequencing target region of the TruRisk® panel applied (Human hg19 chr22:29092810-29093050), sufficient read depth (≥30) at chr22:29093063 was verified using the samtools mpileup utility prior to variant calling.
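The read-depth check described above could be automated roughly as follows. This is only a sketch: it assumes the text output of samtools mpileup (tab-separated columns: chromosome, 1-based position, reference base, read depth, read bases, base qualities), the function name is hypothetical, and the example pileup lines are invented placeholders rather than study data.

```python
MIN_DEPTH = 30                      # threshold stated in the text
TARGET = ("chr22", 29093063)        # hg19 position stated in the text


def depth_at(pileup_text, chrom, pos):
    """Return the read depth reported for chrom:pos, or 0 if the site is absent."""
    for line in pileup_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip empty or malformed lines
        if fields[0] == chrom and int(fields[1]) == pos:
            return int(fields[3])
    return 0


# Invented example lines; the base and quality columns are placeholders.
example_pileup = (
    "chr22\t29093062\tN\t41\t<bases>\t<quals>\n"
    "chr22\t29093063\tN\t38\t<bases>\t<quals>\n"
)

depth = depth_at(example_pileup, *TARGET)
print(f"depth at {TARGET[0]}:{TARGET[1]} = {depth}, sufficient: {depth >= MIN_DEPTH}")
```

Samples that fall below the threshold would be excluded from the frequency estimate, since absence of the variant in a poorly covered sample is uninformative.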
All participants gave informed consent to germline genetic testing approved by ethics committees.

Results
By bioinformatic re-analysis of panel NGS data from Czech BC/OC patients, we identified a previously unreported deep intronic CHEK2 variant located 87 bp from the 3′ end of intron 9. It consisted of a 32 bp deletion replaced by a single cytosine: c.1009-118_1009-87delinsC (Fig. 1A), which we confirmed by Sanger sequencing (Fig. 1B). Sanger sequencing and NGS-based total RNA analysis of available samples from variant carriers showed a clear effect on CHEK2 pre-mRNA splicing. Compared to the wild-type mRNA sequence (Fig. 1C), the variant allele leads to the use of an upstream alternative splice acceptor at position c.1009-142, with consequent retention of the terminal part of intron 9 at the beginning of exon 10 (r.1008_1009ins1009-142_1009-1del1009-118_1009-87insC). At the protein level, this aberrant CHEK2 transcript results in premature termination of translation (p.Tyr337PhefsTer37). Importantly, the proportion of this aberrant CHEK2 transcript varied between 0.07 and 0.26 in RNA samples (Fig. 1D), which suggests either its NMD-mediated degradation or only a partial effect of the investigated variant on pre-mRNA splicing. NMD inhibition in samples from variant carriers resulted in a significant increase of aberrantly spliced mRNA to a proportion of >0.46 (Fig. 1E and F), confirming its partial degradation via NMD and indicating that the majority (>90%) of transcripts from the variant allele are aberrant.
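The step from an aberrant-transcript proportion of >0.46 to ">90% of the variant allele" can be made explicit with a short calculation. The sketch below rests on assumptions that are not stated in the text: carriers are heterozygous, both alleles are transcribed equally, the wild-type allele produces no aberrant transcript, and NMD inhibition is complete. The function name is hypothetical.

```python
def variant_allele_aberrant_fraction(aberrant_proportion, variant_allele_share=0.5):
    """Fraction of variant-allele transcripts that are aberrantly spliced.

    aberrant_proportion: aberrant reads / all CHEK2 reads in the sample.
    variant_allele_share: assumed share of all transcripts that come from
    the variant allele (0.5 for a heterozygote with balanced expression).
    """
    if not 0 <= aberrant_proportion <= variant_allele_share:
        raise ValueError("proportion must lie between 0 and the variant allele share")
    return aberrant_proportion / variant_allele_share


# Lower bound observed after NMD inhibition (cycloheximide): 0.46 / 0.5
print(variant_allele_aberrant_fraction(0.46))
```

Under these assumptions, an observed proportion of 0.46 corresponds to 92% of variant-allele transcripts being aberrant, consistent with the ">90%" stated above. The same calculation applied to untreated samples underestimates the splicing effect, because NMD has already removed part of the aberrant transcript.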

Discussion and conclusions
We have characterized a recurrent germline CHEK2 deep intronic variant, c.1009-118_1009-87delinsC, and showed that it leads to the formation of an aberrantly spliced CHEK2 mRNA that is partially subjected to NMD and encodes a functionally impaired, prematurely terminated protein. Furthermore, we showed that this variant was overrepresented in female BC patients, with a frequency similar to that of two other Slavic founder splicing CHEK2 GPV, c.444+1G>C and c.846+4_846+7del, described previously in the Czech population [21]. Importantly, BC patients carrying germline c.1009-118_1009-87delinsC had characteristics typical of CHEK2 GPV carriers (ER-positive tumors, cancer multiplicity, and positive family history of cancer) [22]. Moreover, the identified CHEK2 variant was associated with an increased risk of female BC development (OR = 6.7). However, we believe that this risk is overestimated because of the lower number of PMC, and that the true risk may be similar to that of CHEK2 pathogenic truncations (OR = 2.47-2.54) or missense variants (OR = 2.83) [2,3,13]. The evidence for an association between CHEK2 GPV and OC predisposition is generally conflicting [13,21]. Nevertheless, we recurrently observed c.1009-118_1009-87delinsC in OC patients, including early-onset patients, who may otherwise represent a specific OC subgroup with an unusually low proportion of GPV in established cancer predisposition genes [23].
In conclusion, the CHEK2 variant c.1009-118_1009-87delinsC results in an aberrant mRNA transcript containing a premature termination codon (p.Tyr337PhefsTer37), producing a functionally impaired CHK2 kinase isoform (ACMG code PS3-moderate) [13]. The mRNA transcript is partially subjected to NMD [24,25]. The variant was significantly enriched in BC patients (ACMG code PS4-strong) with a phenotype typical for known CHEK2 GPV. This led us to classify the c.1009-118_1009-87delinsC variant as likely pathogenic. Further studies are necessary to confirm its clinical implications and to establish its prevalence in other populations. Our study highlights the critical importance of focusing on intronic regions beyond the canonical ±1/2

Fig. 1. Characterization of the c.1009-118_1009-87delinsC CHEK2 variant. (A.) NGS-based DNA sequencing visualized in the Integrative Genomics Viewer (IGV). The dashed gray lines indicate the deletion borders, and the dashed blue arrow denotes the deletion of 32 bp following the nucleotide c.1009-86 at the 3′ end of intron 9, with the insertion of a cytosine. (B.) DNA Sanger sequencing of the variant and wild-type samples. (C-E.) RNA (cDNA) Sanger sequencing of a wild-type sample (C.) and of a variant sample without (D.) and with (E.) NMD inhibition (cycloheximide), showing the increase of aberrant splicing variant signal peaks after NMD inhibition. (F.) RNA panel NGS from a wild-type control (top) and from a carrier of the c.1009-118_1009-87delinsC variant after NMD inhibition (bottom). Note the difference between wild-type and variant RNA in the number (coverage) of intronic retentions (red dashed-line boxes) resulting from the aberrant pre-mRNA splicing. Solid blue arrows indicate the sequence context of aberrant reads showing 86 bases retained from intron 9, interrupted by the 32-base deletion replaced by a single cytosine insertion, followed by 24 bases from intron 9: r.1008_1009ins1009-142_1009-1del1009-118_1009-87insC. We hypothesize that the reassembled primary transcript enhances the pre-existing alternative acceptor splice site TT|ga in intron 9, which precedes the canonical acceptor splice site upstream of exon 10. Note: The IGV visualizes the CHEK2 sequence as a reverse complement, in accordance with the reverse orientation of the CHEK2 gene on chromosome 22. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Table 1. Clinical and histopathological characteristics of breast cancer (top and middle) and ovarian cancer (bottom)