Clinical validation of full genotyping CLART® HPV4S assay on SurePath and ThinPrep collected screening samples according to the international guidelines for human papillomavirus test requirements for cervical screening

Background: To ensure the highest quality of human papillomavirus (HPV) testing in primary cervical cancer screening, novel HPV assays must be evaluated in accordance with the international guidelines. Furthermore, HPV assays with genotyping capabilities are becoming increasingly important in triage of HPV-positive women in primary HPV screening. Here we evaluate a full-genotyping HPV assay intended for primary screening.
Methods: The CLART® HPV4S (CLART4S) assay is a newly developed full-genotyping assay detecting 14 oncogenic (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, 68) and two non-oncogenic HPV genotypes (6, 11). It was evaluated using SurePath and ThinPrep screening samples collected from the Danish and Swedish cervical cancer screening programs, respectively. For calculation of sensitivity, 81 SurePath and 80 ThinPrep samples with confirmed ≥CIN2 were assessed. For clinical specificity analysis, 1184 SurePath and 1169 ThinPrep samples from women with <CIN2 histology were assessed. Sensitivity and specificity of the CLART4S assay were compared to those of an established reference test, the MGP-PCR (Modified General Primers GP5+/6+ with genotyping using Luminex). Intra-laboratory reproducibility and inter-laboratory agreement of the assay were assessed using 540 SurePath and 520 ThinPrep samples. The genotype concordance between CLART4S and MGP-PCR was also assessed.
Results: In SurePath samples, the sensitivity of CLART4S was 0.90 (MGP-PCR = 0.93) and the specificity was 0.91 (MGP-PCR = 0.91); in ThinPrep samples, the sensitivity of CLART4S was 0.98 (MGP-PCR = 1.00) and the specificity was 0.94 (MGP-PCR = 0.87). CLART4S was shown to be non-inferior to MGP-PCR for both sensitivity (p = 0.002; p = 0.01) and specificity (p = 0.01; p = 0.00) in SurePath and ThinPrep samples, respectively. The criteria for intra-laboratory reproducibility and inter-laboratory agreement were met for both media types. The individual genotype concordance between CLART4S and MGP-PCR showed good agreement for almost all 14 HPV genotypes in both media types.
Conclusions: The CLART4S assay proved non-inferior to the comparator assay MGP-PCR for both sensitivity and specificity using SurePath and ThinPrep cervical cancer screening samples from the Danish and Swedish screening programs, respectively. This is the first study to demonstrate clinical validation of a full-genotyping HPV assay conducted in parallel on both SurePath and ThinPrep collected samples.


Background
Human papillomavirus (HPV)-based cervical cancer screening is currently used in several countries including the Netherlands, the US, Denmark, Norway, Sweden, Spain and Australia, with several more countries planning for implementation. Compared to cytology, HPV-based screening has superior clinical sensitivity and negative predictive value [1,2]. Today, more than 200 molecular HPV assays are commercially available [3], and clinical validation remains pivotal to ensure screening-relevant assay performance. The 2009 international guidelines on HPV test validation defined the clinical performance criteria for novel HPV assays based on performance relative to that of Hybrid Capture 2 (HC2) or GP5+/6+-PCR, both of which were validated through randomized trials [4]. The international guidelines also defined a set of inter- and intra-laboratory reproducibility requirements to ensure clinical routine performance.
A decade later, the 2009 validation criteria remain the highest level of validation, yet updates on several pivotal points could be suggested. Firstly, HPV-based screening will, for some time to come, run on liquid-based cytology (LBC) collection media, most notably SurePath and ThinPrep. This allows for HPV screening and subsequent cytology triage of HPV-positive samples on one and the same specimen. Yet these media differ in chemical formulation and, most importantly, in sample collection volume. Secondly, the defined comparator assays have been more or less phased out of clinical use or modified to the point where generating a strictly compliant reference panel for validation of new HPV assays has become an unduly costly and complicated affair. Thirdly, the 2009 criteria do not embrace the technological development towards assays with genotyping, in that the criteria only assess sensitivity and specificity performance on all HPV genotypes combined, not at individual genotype level. Yet the current state-of-the-art screening algorithms from many countries acknowledge and utilize genotyping of at least HPV16 and HPV18, with more genotypes assigned risk and specific management as new screening algorithms are implemented.
Multiple assays have been internationally validated and/or FDA approved for one but rarely both LBC collection media [5][6][7][8][9][10][11][12][13][14], with the BD Onclarity HPV test being the exception [15][16][17]. The lack of validation on both media represents a clear challenge for the introduction of HPV-based screening. Consequently, simultaneous validation of HPV tests on the two market-leading cytology sample collection systems arguably offers a more comprehensive evaluation of novel assays.
The value of genotyping is based upon evidence that HPV genotypes have different oncogenic potential [18][19][20][21][22][23]. HPV16 and HPV18 contribute to approximately 70% of all cervical cancers; the five HPV genotypes HPV31, 33, 45, 52 and 58 are associated with a further 19% of cervical cancers, whereas the remaining six oncogenic HPV genotypes HPV35, 39, 51, 56, 59 and 68 contribute 8-9% [18,24]. Other HPV genotypes are only rarely involved in cervical carcinogenesis, with HPV66 categorized as possibly oncogenic [25]. In addition, the HPV genotype-specific risk of Cervical Intraepithelial Neoplasia (CIN) 3 is age dependent. HPV16 confers the largest risk in women below 30 years of age [26], whereas HPV16 in combination with 18, 31 and 33 together constitute the highest relative risk of disease in women above 30 years of age [26][27][28][29]. Data on the absolute risk of CIN by individual HPV genotypes and the fast evolution of cervical screening technology make it increasingly relevant to consider HPV type-specific risk-based screening algorithms [19,30]. From a guideline perspective, risk stratification based on HPV16 and HPV18 is already incorporated into a number of national guidelines for triage of HPV-positive screening samples [31], as a standalone referral indication for colposcopy or as part of a combined outcome with cytology findings of atypical squamous cells of undetermined significance (ASCUS) or low-grade squamous intraepithelial lesion (LSIL) in certain settings [28,30,32,33].
Here, we assessed the clinical performance of the CLART4S assay relative to that of the comparator assay MGP-PCR (Modified General Primers GP5+/6+ with genotyping using the Luminex (BioRad) assay) using the international guidelines for primary cervical cancer screening [4]. CLART4S was evaluated in both SurePath and ThinPrep collected cervical cancer screening samples, from women aged 30-65 years participating in the Danish program and women aged 23-60 years participating in the Swedish program, respectively.

Sample selection
SurePath cervical cancer screening samples
For the specificity analysis (the "no disease" control population), 1395 residual SurePath samples were collected from Danish women ≥30 years undergoing routine cervical cancer screening at Hvidovre Hospital, Denmark. Collection of the control panel was completed in October 2016. In total, 211 samples were excluded for one of the following reasons: 1) a cytological diagnosis of ASCUS within the past 15 months; 2) a cytological diagnosis of more than ASCUS (>ASCUS) in the past 12 months; 3) cervical cancer or CIN in the previous 3 years; 4) insufficient/incomplete histological follow-up in the Danish register, or a diagnosis of ≥CIN2 during follow-up after baseline analysis; or 5) laboratory processing and/or technical errors. The final control population comprised 1184 samples (mean age 43.4, range 30-65).
For the sensitivity analysis (the "disease" case population), residual SurePath material from 411 consecutive, unselected samples was collected from Danish women undergoing screening between September and October 2012 at Hvidovre Hospital. The samples were derived from women with ≥ASCUS cytology. After collection, samples with insufficient material for testing were excluded, and from the remainder, samples from women ≥30 years with confirmed ≥CIN2 histology were selected, yielding 57 samples in total. In June 2016, an additional 24 samples were selected from 70 consecutive ≥ASCUS samples using the same criteria. In total, the case population consisted of 81 samples from women ≥30 years of age with confirmed ≥CIN2 (mean age 40.3, range 30 to 73).
For assay reproducibility, 474 samples included in the control population were selected. In addition, 70 samples with ≥ASCUS cytology were collected from the routine cervical screening at Hvidovre Hospital, to ensure compliance with the requirement for a 30% HPV-positive rate within the reproducibility element [4]. Four samples were excluded due to technical invalidity in one of three runs for the reproducibility element. In total, DNA from 540 samples was included, comprising 379 MGP-PCR negative and 161 MGP-PCR positive samples. An aliquot of extracted DNA from the 540 reproducibility samples was shipped to the HPV Research Group, University of Edinburgh, who performed the inter-laboratory agreement testing.
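The SurePath sample counts reported above can be reconciled with simple arithmetic. The following Python sketch uses only figures stated in the text; the variable names are illustrative and not taken from the study.

```python
# Figures are taken from the SurePath sample-selection text; names are illustrative.
collected_controls = 1395
excluded_controls = 211
final_controls = collected_controls - excluded_controls  # control population

cases_2012 = 57
cases_2016 = 24
final_cases = cases_2012 + cases_2016  # case population

# Reproducibility panel: 474 control samples plus 70 >=ASCUS samples,
# minus 4 samples excluded for technical invalidity.
repro_total = 474 + 70 - 4
mgp_negative, mgp_positive = 379, 161
positive_rate = mgp_positive / repro_total

print(final_controls, final_cases, repro_total)
print(f"MGP-PCR positive fraction of the reproducibility panel: {positive_rate:.1%}")
```

The counts add up to the reported 1184 controls, 81 cases and 540 reproducibility samples, and the MGP-PCR positive fraction comes out at roughly 30%.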

ThinPrep cervical cancer screening samples
For the specificity analysis (the "no disease" control population), all women undergoing routine cervical screening in Stockholm County, Sweden, between 01-jan-2013 and 31-dec-2015 were included, in total 290,793 samples. Of these, 117,365 had sample residuals stored in the Clinical Cytology Biobank, Karolinska University Laboratory, Stockholm, Sweden. Subsequently, all women with both the current and previous cytology classified as normal (n = 92,695) were identified. From these, a random set of 1169 samples with sufficient material was drawn (mean age 38.3, range 30-63).
For the sensitivity analysis (the "disease" case population), all women with ≥CIN2 diagnosed in routine cervical screening in Stockholm, Sweden, who had residual samples stored in the Clinical Cytology Biobank, Karolinska University Laboratory, Stockholm, Sweden, were identified from 01-jan-2013 to 31-dec-2015 (n = 4274). From these, 80 consecutive samples from women ≥23 years of age with confirmed ≥CIN2 (mean age 34, range 23-60) were selected. Of these, 21 samples were derived from women <30 and 59 from women ≥30 years of age.
For the reproducibility analysis, samples derived from women participating in primary HPV-based screening in Stockholm, stored in the Clinical Cytology Biobank, Karolinska University Laboratory, were identified from 1-sep-2014 (the start of primary HPV screening for women above 30) to 30-nov-2014. The first 160 consecutive samples registered as HPV-positive and the first 360 consecutive samples registered as HPV-negative were included, for a total of 520 samples. An aliquot of extracted DNA from the 520 reproducibility samples was shipped to the HPV Research Group, University of Edinburgh, where inter-laboratory agreement testing was performed. In total, 491 and 495 valid samples were included in the intra- and inter-laboratory reproducibility elements, respectively.

DNA extraction
The MagNA Pure 96 platform (Roche Diagnostics, Rotkreuz, Switzerland) with the MagNA Pure LC total nucleic acid isolation kit (Roche Diagnostics) was used for both the SurePath and ThinPrep samples. For SurePath, 1 ml of material was preprocessed with heat treatment for 1 h at 56°C with proteinase K, followed by 1 h at 90°C to reverse formaldehyde-induced cross-linking prior to extraction. Extracted DNA was stored refrigerated prior to CLART4S and MGP-PCR testing. For ThinPrep, an aliquot of 100 μL from all included ThinPrep samples was extracted, and the resulting DNA was stored at −20°C prior to CLART4S and MGP-PCR Luminex testing.

GENOMICA CLART® HPV4S assay
GENOMICA CLART® HPV4S is a PCR-based microarray assay that targets the HPV L1 region and detects 16 individual genotypes: HPV16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, 68 and HPV6 and 11. The assay has two internal controls: one for PCR performance and one for sample sufficiency and assay performance. The internal control for PCR processing relies on amplification of a spiked CFTR plasmid and is used to validate the individual PCR run; the internal control for human CFTR is used to verify that the sample contains sufficient human material. The assay is fully automated after PCR amplification using the autoclart®plus platform. In short, 5 μl aliquots of extracted DNA were used for the CLART HPV4S PCR amplification. Prior to visualization on low-density microarrays, the PCR products were denatured at 95°C for 10 min. Visualization and reporting of genotyping results were done automatically on the Clinical Array Reader (CAR®) as part of the automated autoclart®plus workflow. All samples with an invalid result (no human CFTR amplification detected or no spiked CFTR plasmid amplification detected) were retested once, and the second result was considered definitive. The CLART4S assay run protocol was independent of sample media. As part of the validation, a posteriori optimization of genotype-specific cut-off values was conducted against detection of ≥CIN2 and <CIN1, resulting in two LBC-specific, optimized versions of the assay reading software. The final dataset was analyzed using the ThinPrep- and SurePath-specific assay reading software versions.

MGP-PCR and HPV typing using Luminex
All samples were HPV genotyped using MGP-PCR, with primers targeting L1 and type-specific probes detected by Luminex technology, as previously described [34][35][36]. Briefly, 5 μL aliquots of extracted DNA were used in the MGP-PCR in a total volume of 25 μL. Forty-two beads (37 different HPV types, three HPV variants, and two 'universal' HPV probes) were included in the Luminex assay. Samples with a grey-zone result were retested in duplicate, and HPV type(s) that were reproducible were considered definitive. All MGP-PCR and Luminex testing was performed at the Karolinska Institute, Stockholm, Sweden. MGP-PCR and HPV typing using Luminex were performed with the same protocol for both SurePath and ThinPrep collected samples.

SurePath procedure (Denmark)
Cytology was read following the Bethesda 2001 criteria. Hvidovre Hospital employs computer-assisted screening using the FocalPoint™ GS imaging system and SlideWizard™ (BD Diagnostics, Burlington, NC), prior to cyto-screener review. HPV testing was performed after cytology evaluation; hence the cyto-screener was blinded to the HPV result upon evaluation, except for ASCUS cases, which are routinely reflex tested for HPV in accordance with current Danish national screening guidelines. All abnormal cytology findings were routinely adjudicated by a pathologist. According to national guidelines, women with LSIL were invited for repeat cytology testing after 6 months. Women with normal cytology were returned to routine screening after 3 years if aged 23-49 or 5 years if aged 50-59. All women included in the study were managed according to the routine guidelines for the Danish cervical cancer screening program.

ThinPrep procedure (Sweden)
Cytology was read following the Bethesda 2001 criteria. Manual screening review was performed by specially trained cyto-diagnosticians, with ambiguous cases resolved by specialist cytologist review. The Karolinska University Laboratory is the central cervical screening diagnostic laboratory for the Stockholm region, the capital region of Sweden. Within the cervical screening program, a randomized health services study was performed during 2012-2016 [37] for women aged 30-60. Half of the population was randomized to primary cytology, and half to primary HPV-based screening. In the cytology arm, ASCUS cases were routinely tested for HPV in accordance with guidelines. In the HPV arm, HPV-positive samples were tested with reflex cytology. In 2015, Sweden issued new guidelines for cervical screening recommending HPV-based screening for all women 30-64 years of age [38]. The Stockholm-Gotland region has been biobanking residuals from screening samples gradually since 2011, and from all cervical screening samples since 2013, at the Clinical Cytology Biobank, Karolinska University Laboratory, Karolinska University Hospital.

Danish procedures
In Denmark, women ≥30 years with ASCUS and a concurrent HPV-positive test result are referred to colposcopy with biopsies, as are women with high-grade squamous intraepithelial lesions (HSIL), atypical squamous cells, cannot exclude HSIL (ASC-H), atypical glandular cells (AGC) or cytological signs of carcinoma, and women with a persistent ASCUS or LSIL cytological diagnosis. Danish screening guidelines require biopsies from all aceto-white lesions, or four random biopsies taken counter-clockwise from all four quadrants in cases where no lesions are visible upon colposcopy. All histological data included in the study were retrieved from the Danish Pathology Data Register.

Swedish procedures
In Sweden, women are referred to colposcopy with biopsies according to guidelines similar to those listed above for Denmark. For women with suspected high-grade disease, biopsies from lesions or random biopsies should be performed in a similar fashion. All histological data included in the study were retrieved from the Swedish National Cervical Screening Register.

Data analysis
For CLART4S, a sample was considered HPV positive if at least one of the 14 genotypes (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66 and 68) was detected. Samples with HPV6 and/or HPV11 alone, without any of the other 14 HPV genotypes, were considered HPV screen negative. The same was true for MGP-PCR. The CLART4S assay automatically reports genotype findings detected in an "uncertainty" range if the visualization outcome falls close to the manufacturer's cut-off. Reflecting routine practice at our facility, these genotypes are considered positive only if part of a multiple infection.
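The positivity rule described above can be sketched in Python as follows. This is a minimal illustration, not the assay software: the function name and arguments are hypothetical, and it assumes that "part of a multiple infection" means the uncertain genotype is reported alongside at least one genotype detected above the cut-off.

```python
# The 14 genotypes that count towards an HPV screen-positive call (from the text).
SCREENING_GENOTYPES = frozenset({16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, 68})


def is_screen_positive(detected, uncertain=()):
    """Return True if a sample counts as HPV screen positive.

    detected  -- genotypes reported above the cut-off
    uncertain -- genotypes reported in the "uncertainty" range
    Assumption: uncertain genotypes are accepted only when at least one
    genotype is clearly detected in the same sample (multiple infection).
    """
    accepted = set(detected)
    if accepted:
        accepted |= set(uncertain)
    # HPV6/11 are not in SCREENING_GENOTYPES, so alone they give a negative call.
    return bool(accepted & SCREENING_GENOTYPES)


print(is_screen_positive({16}))         # True: oncogenic genotype detected
print(is_screen_positive({6, 11}))      # False: only non-oncogenic genotypes
print(is_screen_positive(set(), {31}))  # False: single uncertain finding
print(is_screen_positive({6}, {31}))    # True under the stated assumption
```

Whether an uncertain genotype accompanied only by HPV6 or HPV11 should count is not specified in the text; the last call shows how the assumed rule treats that case.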
Clinical specificity and sensitivity values for CLART4S were compared to those of MGP-PCR using the non-inferiority score test, where non-inferiority is defined as a relative specificity for <CIN2 of ≥98% and a relative sensitivity for ≥CIN2 of ≥90%. For the intra-laboratory reproducibility and inter-laboratory agreement, a lower confidence bound of ≥87% was used as a threshold [4]. The Excel sheet for the non-inferiority score test was provided by VU University, Amsterdam, The Netherlands [4]. In the Swedish study population, Fisher's exact test of homogeneity was applied to test homogeneity of the distribution of HPV status in women <30 and ≥30 years of age. For other statistical computations, including 95% CIs, SPSS Statistics 22 software was used.
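Written out, the non-inferiority criteria correspond to the one-sided hypotheses below. The notation is introduced here for illustration only: Se and Sp denote clinical sensitivity for ≥CIN2 and clinical specificity for <CIN2, and the subscripts identify the assay.

```latex
\begin{align*}
H_0^{\mathrm{Se}}:\ \frac{\mathrm{Se}_{\mathrm{CLART4S}}}{\mathrm{Se}_{\mathrm{MGP}}} < 0.90
  &\quad\text{vs.}\quad
H_1^{\mathrm{Se}}:\ \frac{\mathrm{Se}_{\mathrm{CLART4S}}}{\mathrm{Se}_{\mathrm{MGP}}} \ge 0.90 \\[4pt]
H_0^{\mathrm{Sp}}:\ \frac{\mathrm{Sp}_{\mathrm{CLART4S}}}{\mathrm{Sp}_{\mathrm{MGP}}} < 0.98
  &\quad\text{vs.}\quad
H_1^{\mathrm{Sp}}:\ \frac{\mathrm{Sp}_{\mathrm{CLART4S}}}{\mathrm{Sp}_{\mathrm{MGP}}} \ge 0.98
\end{align*}
```

Under this formulation, a small p-value rejects the null hypothesis of inferiority, so non-inferiority is concluded when p falls below the chosen significance level.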
One CIN2 and one CIN3 case were positive by CLART4S (HPV 31 and HPV 33, respectively) but negative by MGP-PCR.
For clinical validation of the CLART4S assay in ThinPrep samples, 21 ThinPrep cervical screening samples were collected from women <30 and 59 were collected from women ≥30 years, all with histologically confirmed ≥CIN2 lesions during follow-up (mean age: 34.0, range 23-60). Of these, 51 were CIN2, 27 CIN3, 1 AIS and 1 cancer. There was no significant heterogeneity between samples from women above and below age 30 (p = 0.46). The sensitivity of CLART4S in ThinPrep samples was 0.98 (95% CI: 0.91-1.0), compared to 1.0 for MGP-PCR (95% CI: 0.95-1.00). The sensitivity of CLART4S in ThinPrep samples was non-inferior to that of MGP-PCR (p = 0.01, Table 2). Two CIN2 cases were negative by CLART4S and positive by MGP-PCR (HPV59 and HPV66, respectively).

Intra-laboratory reproducibility and inter-laboratory agreement
The intra-laboratory reproducibility on SurePath samples was 95% (lower confidence bound: 0.93, kappa value: 0.87, Table 4). The inter-laboratory agreement on SurePath was 89% (lower confidence bound: 0.87, kappa value: 0.69). The reproducibility of the individual genotype results showed overall moderate to excellent agreement (range 0.50 to 1.00) for the intra-laboratory reproducibility (Table 5). For the inter-laboratory agreement, the genotype concordance was slightly lower, with poor to excellent agreement (range 0.14 to 0.87, Table 5).
The intra-laboratory reproducibility in ThinPrep samples was 92% (lower confidence bound 0.90, kappa value 0.70, Table 6). The inter-laboratory agreement was 95% (lower confidence bound 0.93, kappa value 0.81). The reproducibility of the individual genotype results was good overall for both inter- and intra-laboratory agreement, but ranged from poor to excellent depending on the genotype assessed (range 0.00-0.92 and 0.00-1.00, respectively, Table 7).
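The agreement statistics reported in this section can be illustrated with a short Python sketch that computes overall agreement and Cohen's kappa from a paired 2x2 table. The counts in the example are hypothetical, not study data, and the Wilson score interval is an assumed choice, since the text does not state how the lower confidence bound was derived.

```python
import math


def agreement_and_kappa(both_pos, first_only, second_only, both_neg):
    """Overall agreement and Cohen's kappa for two paired runs of one assay."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n
    # Chance agreement from each run's marginal positive and negative rates.
    p_first_pos = (both_pos + first_only) / n
    p_second_pos = (both_pos + second_only) / n
    expected = p_first_pos * p_second_pos + (1 - p_first_pos) * (1 - p_second_pos)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


def wilson_lower_bound(successes, n, z=1.96):
    """Lower limit of the two-sided 95% Wilson score interval (assumed method)."""
    p = successes / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)


# Hypothetical counts for illustration only.
observed, kappa = agreement_and_kappa(both_pos=40, first_only=5, second_only=5, both_neg=50)
lower = wilson_lower_bound(successes=45 + 45, n=100)
print(f"agreement={observed:.2f}, kappa={kappa:.2f}, lower bound={lower:.2f}")
```

With a threshold of 87% on the lower confidence bound, this hypothetical panel would fail the criterion even though its point agreement is 90%, which shows why the bound rather than the point estimate drives acceptance.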

Discussion
In this study, we validated in parallel the clinical performance of the novel CLART4S assay on SurePath and ThinPrep collected samples relative to the comparator assay MGP-PCR. CLART4S is a full-genotyping assay that detects 16 HPV genotypes individually. Here, the CLART4S assay was shown to have clinical specificity and sensitivity similar to those of the comparator MGP-PCR assay for both SurePath and ThinPrep collected cervical cancer screening samples (Tables 1 and 2). The specificity (p = 0.01 for SurePath; p = 0.00 for ThinPrep) and sensitivity (p = 0.002 for SurePath; p = 0.01 for ThinPrep) of CLART4S were non-inferior to those of MGP-PCR for both LBC collection media. The Swedish case population contained 21 samples from women <30 years of age, which is not strictly in compliance with the international criteria. Fisher's exact test of homogeneity was applied to the distribution of HPV status in women <30 and ≥30 years of age, and the distributions were found to be similar. Consequently, the outcomes showed excellent sensitivity of CLART4S in both populations, below and above 30 years of age.
As the comparator assay, we used the MGP assay with Luminex, which is a multiprimer system detecting at least 14 screening-relevant HPV types [36]. The MGP assay showed slightly higher sensitivity for detection of individual genotypes than the classical GP5+/6+ single primer pair.
At the level of overall HR-HPV detection, CLART4S displayed intra-laboratory reproducibility and inter-laboratory agreement on SurePath collected samples that met the recommended lower confidence bound of 87% (lower bounds of 93% and 87%, respectively). However, the inter-laboratory agreement was borderline for acceptance, which can cause quality assurance issues in non-expert laboratories. For ThinPrep, the lower confidence bounds for intra-laboratory reproducibility and inter-laboratory agreement were 90% and 93%, respectively (kappa: 0.70 and 0.81). At the individual genotype level for SurePath samples, kappa values ranging from 0.14 to 1.0 were observed, displaying large variation in CLART4S performance depending on the genotype in question. Similar results were observed for ThinPrep collected samples (range 0 to 1.00).
The individual genotype concordance for the combined 1265 SurePath samples showed good agreement between CLART4S and the comparator assay, except for HPV66 and HPV68. For the combined 1249 ThinPrep samples, genotype concordance was good, except for genotypes HPV39, HPV59, HPV66 and HPV68. According to the IARC classification [25], HPV66 is considered possibly oncogenic and HPV68 is considered probably oncogenic. Neither is frequent in cervical cancers, nor are HPV59 and 39 despite their firm inclusion in the IARC high-risk oncogenic group, so overall the clinical implication of poor concordance may be marginal.
With respect to validation of HPV assays, reproducibility of the individual genotypes reported by an assay is rarely reported, nor is it required by the acceptance criteria. Reproducibility is measured only at the level of "presence/absence" of oncogenic HPV, yet this could mask substantial performance issues at the individual genotype level. To this end, a review or an update of the international criteria that incorporates an adjudication of type-specific performance is timely. Arguably, this is particularly important given the increasing number of tests that have typing capability and the increasing interest in risk stratification using typing information. Another aspect that merits international discussion is how to utilize genotyping in clinical screening algorithms. A multitude of questions remain, however, as to how the clinical algorithms would change contingent on the result of full typing. For example, could women be referred for colposcopy versus re-test based upon specific risk estimates of single and multiple genotype combinations, or could full genotyping help improve screening by allowing extended follow-up for women with the lowest-risk genotypes?
In this study, the sensitivity and specificity of CLART4S were slightly different between the ThinPrep and SurePath cohorts. Both collection media allow for additional testing for HPV prior to (primary screening) or subsequent to cytological evaluation (triage). Nevertheless, we argue that a substantial part of the observed difference in assay performance stems from the LBC collection media. Firstly, the cellularity is not the same between the two sample collection methods, yet for CLART4S, as for all HPV screening assays, the analytical input per test has the same volume irrespective of the sample collection medium.
The SurePath vial contains 10 ml medium and the brush is left in the vial after sampling and prior to the cytology procedure. The ThinPrep vial contains 20 ml medium and the brush is rinsed in the medium and subsequently discarded. Moreover, SurePath contains a low concentration of formaldehyde added to the alcohol fixative to ensure adequate preservation of the cell material, whereas ThinPrep uses methanol as the sole fixative. The difference in fixative has been a source of discussions internationally, with the claim that HPV analysis on SurePath samples can be challenged by the cross-linking between DNA and proteins driven by the formaldehyde content [42,43]. However, correct preanalytical treatment of samples counters the impact of formaldehyde [44]. Furthermore, studies with BD Onclarity [16,17], Hologic Aptima [45,46], Roche cobas [47,48] and Genomica CLART HPV2 [49] have previously shown that SurePath collected samples can safely be used for HPV analysis. ThinPrep collected LBC medium, on the other hand, does not contain formaldehyde and consequently is considered less challenging with respect to HPV testing.
Finally, we note that the SurePath and ThinPrep populations were collected from two different cervical cancer screening programs, which could also contribute to the observed variance in assay performance. From an operational perspective, one outcome of this study is that the manufacturer of CLART4S has equipped the HPV analysis platform with two software versions, optimized for SurePath and ThinPrep collected samples, respectively, which constitutes an adept solution to the issue.

Conclusion
In conclusion, our data show that the CLART4S assay is non-inferior to the comparator MGP-PCR assay with respect to clinical sensitivity and specificity, for both SurePath and ThinPrep collected cervical cancer screening samples. Moreover, intra-laboratory reproducibility and inter-laboratory agreement fulfill the international validation criteria for both media types. Given the full-genotyping capabilities of the assay, CLART4S is a suitable candidate for future primary HPV screening. Finally, we would put forth two suggestions: 1) that all HPV assays be validated on both ThinPrep and SurePath media rather than assuming similar performance across both; 2) that type-specific validation metrics become part of the validation criteria to accommodate the evolution of HPV assays towards individual genotype detection.