Background/objectives: Providing summary recommendations regarding self collection of vaginal specimens for human papillomavirus (HPV) testing is difficult owing to the wide range of published estimates for the diagnostic accuracy of this approach. We aimed to determine summary estimates of the sensitivity, specificity, and summary receiver operating characteristic (SROC) curves for self collected vaginal specimens for HPV testing compared with the reference standard, clinician collected HPV specimens.
Methods: Standard search criteria for a diagnostic systematic review were employed. Eligible studies were combined using a random effects model, and summary ROC curves were derived overall and for specific subgroups.
Results: Summary measures were determined from 12 studies. Six studies where patients used Dacron or cotton swabs or cytobrushes to obtain samples were pooled and had an overall sensitivity of 0.74 (95% CI 0.61 to 0.84) and specificity of 0.88 (95% CI 0.83 to 0.92), with diagnostic odds ratio of 22.3 and an area under the curve of 0.91. Self specimens using Dacron or cotton swabs or cytobrushes collected by women enrolled at referral clinics had an overall sensitivity of 0.81 (95% CI 0.65 to 0.91) and specificity of 0.90 (95% CI 0.80 to 0.95). Sensitivity and specificity of tampons ranged from 0.67–0.94 and 0.80–0.85 respectively.
Conclusions: Our findings indicate that the combined sensitivity for HPV-DNA is more than 70% when patients use Dacron swabs, cotton swabs, or cytobrushes to obtain their own vaginal specimens for HPV-DNA evaluation. Self collected HPV-DNA swabs may be an appropriate alternative for low resource settings or in patients reluctant to undergo pelvic examinations.
- AUC, area under the curve
- HPV, human papillomavirus
- PCR, polymerase chain reaction
- ROC, receiver operating characteristic
- SROC, summary receiver operating characteristic
- human papillomavirus
Although largely preventable, cervical cancer remains a common worldwide malignancy.1 The widespread screening for cervical cancer by Papanicolaou (Pap) smear has led to a substantial decrease in the prevalence of the disease, but this screening method has recognised limitations, including poor interobserver reproducibility,2 limited correlation with disease process,3,4 and poor uptake by women of lower socioeconomic status5 and women deemed at risk.6 In addition, Pap screening programmes require significant infrastructure and resources.7 All of these limitations have prompted the search for improved methods of screening for cervical cancer.
Human papillomavirus (HPV) is now well established as the necessary but insufficient cause of cervical cancer.8 Testing for HPV-DNA has been recommended both as part of screening for cervical cancer9 and for management of women with low grade cervical cytological abnormalities.10 HPV-DNA can be detected in cervicovaginal specimens by signal amplification techniques, such as the Hybrid Capture II HPV-DNA assay (Digene Corporation), or by nucleic acid amplification with polymerase chain reaction (PCR). The ability to reliably detect high risk HPV-DNA in cervicovaginal secretions has significant clinical implications since women who do not have HPV-DNA are unlikely to develop cervical cancer.11 Recent studies confirmed that combined HPV-DNA and Pap testing had a sensitivity of almost 100% for cervical intraepithelial neoplasia 2 and 3 (CIN2/3).12 In contrast, the Pap test or HPV-DNA test alone had sensitivities of 60% and 85%, respectively,13 for high grade lesions, confirming the diagnostic advantage offered by using both tests together for cervical cancer screening.
One possible advantage that HPV-DNA testing offers for cervical cancer screening programmes is the method of specimen collection. While Pap smear collection requires a pelvic examination, collection of a vaginal specimen for HPV-DNA testing can be performed by the patients themselves. In resource limited settings, patient collected specimens for HPV-DNA might be acceptable as the primary screening test for cervical cancer, thus decreasing the need for practitioners to conduct screening. However, in societies with adequate healthcare resources, self collected HPV-DNA testing could be combined with Pap testing to improve cervical screening. Self collection of HPV-DNA specimens may also be more acceptable in populations that have difficulty obtaining Pap smears, such as abuse survivors14 and women with cultural concerns.15,16 Also, annual self testing for HPV-DNA could be used to screen women from geographically isolated regions without access to regular medical care, again to determine women who need to be offered further clinical evaluation.
Numerous studies have evaluated the accuracy of testing for HPV-DNA on patient collected vaginal specimens compared with clinician collected specimens for HPV-DNA.17–29 However, summary recommendations cannot be made from these studies, because their findings are heterogeneous and a variety of specimen collection devices have been used. Because much of the proposed value of self collected HPV-DNA testing depends on the consistency and quality of test operating characteristics, discordant findings limit the applicability of this test, as it is not possible to define the true diagnostic attributes of self collected specimens compared with clinician collected specimens. Given this wide range of operating characteristics, we conducted a meta-analysis comparing the accuracy of patient collected vaginal specimens for HPV-DNA with clinician collected specimens (reference standard) for vaginal HPV-DNA in order to derive the most precise summary estimates of the sensitivity, specificity, and diagnostic odds ratio. In addition, a summary receiver operating characteristic (SROC) curve was generated for the patient collected HPV-DNA specimens. SROC curves allow comparison of test performance over several diagnostic thresholds.
The search strategy followed established methods recommended for diagnostic systematic reviews.3,30–32 Medline (1966–2002), Embase, Cochrane Database of Abstracts of Reviews of Effectiveness, Cochrane Controlled Database of Systematic Reviews and Cochrane Central Registry of Controlled Trials were searched. Medical subject (MeSH) headings of “human papillomavirus/HPV,” “cervix neoplasms,” “cervical intraepithelial neoplasia” were exploded and combined with “self$.” The studies were then limited to English language. In addition, an expert in the field was consulted to assist in identifying any studies not found through the electronic search.
Selection of studies
Studies that met the widely accepted methodological criteria for diagnostic studies were included:
- consecutively or randomly recruited women
- reference (criterion) standard applied uniformly (that is, clinician collected specimen)
- Hybrid Capture-II (HC-II) or polymerase chain reaction (PCR) analysis of the sample
- blinded analysis of the sample(s).
Two of the authors (GO, DP) identified and reviewed studies to be included, and agreement on inclusion/exclusion was calculated using Cohen’s kappa. Disagreements were resolved by consensus. Data abstracted from each article included the number of patients tested, clinical setting (outreach, primary care, referral setting), recruitment (consecutive or random), sample type (patient and clinician), diagnostic method (PCR/HC-II), whether high risk or low risk HPV was evaluated, and data for a 2 × 2 table (true positive, false positive, false negative, true negative). If a study involved several self sampling methods (swab, tampon, cervicovaginal lavage), the first method described was used in the analysis. If we were unable to construct a 2 × 2 table from the data available in the paper, the author was contacted in order to obtain paired data from each patient enrolled in the study.
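The measures derived from each 2 × 2 table can be illustrated with a minimal Python sketch. The function name `two_by_two_metrics` and the counts below are hypothetical and are not taken from any included study; the kappa shown is the standard Cohen's kappa for agreement between the self collected and clinician collected results.

```python
def two_by_two_metrics(tp, fp, fn, tn):
    """Accuracy of the self collected test against the clinician collected reference.

    tp, fp, fn, tn are the four cells of the 2 x 2 table.
    Assumes fp and fn are both non-zero (otherwise the odds ratio is undefined).
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    diagnostic_odds_ratio = (tp * tn) / (fp * fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "dor": diagnostic_odds_ratio,
        "kappa": kappa,
    }


# Hypothetical counts, for illustration only.
m = two_by_two_metrics(tp=40, fp=10, fn=10, tn=140)
print(m)
```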
Heterogeneity of odds ratios was determined by the Q test.32 The presence or absence of a threshold effect was determined by Spearman’s rho.30,32 The kappa value between clinician and patient collected specimens was determined for each study.33 Summary estimates of sensitivity and specificity were pooled and weighted using Meta-test software (Joseph Lau, MD, New England Medical Center, Boston, MA, USA). DerSimonian and Laird random effects model was employed for all estimates.34 Summary estimates were generated for studies that used similar swab types, while ranges for sensitivity and specificity were provided for other subgroups such as diagnostic method or recruitment site. Reference standard in each case was the clinician obtained specimen.
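The pooling step can be sketched as follows. This is a minimal illustration of the DerSimonian and Laird estimator together with Cochran's Q, not a reproduction of the Meta-test software. Pooling on the logit scale and adding 0.5 to each count are assumptions made for this sketch, and the study counts are hypothetical.

```python
import math


def logit_with_variance(events, total):
    """Logit of a proportion and its approximate variance (0.5 added to each count)."""
    a = events + 0.5
    b = (total - events) + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def dersimonian_laird(effects, variances):
    """Return (pooled effect, standard error, Cochran's Q, tau squared)."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Method of moments estimate of the between-study variance, truncated at zero.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, q, tau2


# Hypothetical (true positives, reference standard positives) per study.
studies = [(45, 60), (30, 50), (80, 90), (20, 35)]
pairs = [logit_with_variance(tp, n) for tp, n in studies]
pooled, se, q, tau2 = dersimonian_laird([p[0] for p in pairs], [p[1] for p in pairs])

sens = expit(pooled)
ci = (expit(pooled - 1.96 * se), expit(pooled + 1.96 * se))
print(f"pooled sensitivity {sens:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, Q = {q:.2f}")
```

Q is referred to a chi-squared distribution with k − 1 degrees of freedom (k being the number of studies) to obtain the heterogeneity p value.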
Heterogeneity in diagnostic studies arises from a variety of different sources, including study design and patient populations. Given the interdependent nature of sensitivity and specificity, different diagnostic thresholds will provide varying sensitivities and specificities. In order to address this, receiver operating characteristic (ROC) curves were used to provide a graphical representation of the diagnostic characteristics of tests at varying thresholds. Using the method described by Moses,35 a summary ROC (SROC) was plotted for studies that used similar swab types (Dacron or cotton swab or cytobrush). Cox adjustment was employed to avoid undefined transformations,35 and outlier studies were identified by visual inspection of logit regression plots. The Q* estimate was also provided. The Q* estimate is the optimal estimate of the performance of the test and corresponds to the meta-analytically estimated values of the sensitivity and specificity of the test at the point where the pooled ROC curve crosses the negative diagonal. It is the point where sensitivity and specificity are equal, and is an indicator of the proximity of the ROC curve to the upper left hand corner. Area under the curve (AUC), which is a measure of the ability of a test to assign the correct value to a random pair of infected and non-infected individuals, was calculated for each SROC using the Meta-test software.36
Of the 821 studies identified in the search, 106 studies that included either clinician or self collected specimens for HPV-DNA were reviewed. Abstracts were reviewed against the entry criteria. Agreement on inclusion/exclusion of a study for the meta-analysis was κ = 0.98 (95% CI, 0.96 to 1.00), and the decision to include the disputed study was made after careful review of the abstract. Sixteen studies were deemed eligible for the meta-analysis.17–29,37–39 However, one study was a case-control study and was therefore excluded at the time of data extraction.22 Although six studies did not provide the raw data needed for calculations, on request these data were provided by three authors.17,18,27 The remaining two authors of three studies were not able to provide raw data,37–39 leaving 12 studies to be combined (table 1).
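The SROC construction can be sketched in Python as follows. For each study, D = logit(TPR) − logit(FPR) and S = logit(TPR) + logit(FPR) are computed, D is regressed on S, and Q* is read off at S = 0. This is a minimal sketch under stated assumptions: the regression is unweighted least squares, the Cox adjustment is taken to mean adding 0.5 to every cell, the AUC is obtained by simple numerical integration, and the 2 × 2 tables are hypothetical. The Meta-test software may differ in these details.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def moses_sroc(tables):
    """tables: list of (tp, fp, fn, tn). Returns (intercept a, slope b, Q*)."""
    d_vals, s_vals = [], []
    for tp, fp, fn, tn in tables:
        # Continuity correction (assumed form of the Cox adjustment).
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
        tpr = tp / (tp + fn)
        fpr = fp / (fp + tn)
        d_vals.append(logit(tpr) - logit(fpr))  # log diagnostic odds ratio
        s_vals.append(logit(tpr) + logit(fpr))  # proxy for the test threshold

    n = len(tables)
    s_mean = sum(s_vals) / n
    d_mean = sum(d_vals) / n
    sxx = sum((s - s_mean) ** 2 for s in s_vals)
    sxy = sum((s - s_mean) * (d - d_mean) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = d_mean - b * s_mean

    # Q*: the point where sensitivity equals specificity, so S = 0 and logit(TPR) = a / 2.
    q_star = expit(a / 2.0)
    return a, b, q_star


def sroc_auc(a, b, steps=10000):
    """Area under TPR = expit((a + (1 + b) * logit(FPR)) / (1 - b)), midpoint rule.

    Requires b < 1 for the curve to be defined.
    """
    total = 0.0
    for i in range(steps):
        fpr = (i + 0.5) / steps
        total += expit((a + (1.0 + b) * logit(fpr)) / (1.0 - b))
    return total / steps


# Hypothetical 2 x 2 tables (tp, fp, fn, tn), for illustration only.
tables = [(45, 8, 15, 120), (30, 12, 20, 150), (60, 20, 15, 100), (20, 6, 15, 90)]
a, b, q_star = moses_sroc(tables)
print(f"a = {a:.2f}, b = {b:.2f}, Q* = {q_star:.2f}, AUC = {sroc_auc(a, b):.2f}")
```

A slope b close to zero indicates that the diagnostic odds ratio does not vary with threshold, in which case the SROC is symmetric about the negative diagonal.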
Q test for heterogeneity was conducted overall and for subgroups. All were significant (p<0.01) with the exception of the studies enrolling women with abnormal Pap smears at referral centres. Overall, Spearman’s rho was –0.1, indicating no threshold effect.
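A threshold effect check of this kind can be sketched as a rank correlation across studies. The source does not state which pair of quantities was correlated; this sketch assumes sensitivity against the false positive rate (1 − specificity), where a strong positive correlation would suggest a threshold effect. The per-study values are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-study sensitivities and false positive rates.
sensitivities = [0.74, 0.60, 0.88, 0.57, 0.93]
false_positive_rates = [0.10, 0.16, 0.07, 0.12, 0.21]
print(f"rho = {spearman_rho(sensitivities, false_positive_rates):.2f}")
```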
As part of constructing the SROC, SROC logit regression plots were generated. Visual inspection of the regression of D on S identified one study24 as an extreme outlier, with extreme values of both D (10.2) and S (−5.1) compared with the other studies. In addition, this study had a specificity of 1.0 (95% CI 1.0 to 1.0), with no false positive specimens, compared with specificities of 0.79 to 0.94 in the other studies. This study also had a sample size of 1194 and therefore exerted a strong influence on the SROC and summary estimates. As such, estimates are presented with and without this outlier study.
Sensitivity of self collected vaginal specimens for HPV-DNA ranged from 0.56–1.00 and specificity ranged from 0.79–1.00 (table 2). The kappa values between patient and clinician obtained samples in individual studies ranged from 0.45–1.00. Six studies,17,18,20,21,25,27 including 2537 subjects where patients used Dacron or cotton swabs or cytobrushes to obtain samples, were pooled and had an overall sensitivity of 0.74 (95% CI 0.61 to 0.84), specificity of 0.88 (95% CI 0.83 to 0.92), and a diagnostic odds ratio of 22.3 (95% CI 11.7 to 42.6). When the outlier study was included,24 there were 3731 subjects, and the summary sensitivity for Dacron swabs, cotton swabs, or cytobrushes was 0.78 (95% CI 0.65 to 0.88) and the summary specificity was 0.90 (95% CI 0.85 to 0.94). The diagnostic odds ratio was 35.5 (95% CI 15.3 to 82.3).
Four studies where self specimens were obtained with Dacron swabs, cotton swabs, or cytobrushes from women recruited at referral clinics17,18,25,27 included 803 subjects and had an overall sensitivity of 0.81 (95% CI 0.65 to 0.91), specificity of 0.90 (95% CI 0.80 to 0.95), and a diagnostic odds ratio of 37.6 (95% CI 24.2 to 58.4). Three studies using tampons26,28,29 with 411 subjects had a sensitivity ranging from 0.67 to 0.94 and a specificity ranging from 0.80 to 0.85. Seven studies used PCR as the diagnostic method,19,20,23,25,27,28,29 and specimen collection types included cervicovaginal lavage, Dacron swabs, cotton swabs, and tampons. Sensitivity for PCR ranged from 0.63–1.00 and specificity ranged from 0.80–1.00. Five studies used Hybrid Capture-II as the diagnostic method,17,18,21,24,26 and had sensitivities ranging from 0.56–0.93 and specificities ranging from 0.79–1.00. Three studies were conducted in an outreach/primary care setting,20,21,24 and Dacron or cotton swabs were used to obtain the sample. Sensitivity ranged from 0.56–0.93 and specificity ranged from 0.84–1.00 (table 3).
Summary ROC curves are shown in figure 1. The area under the curve for studies using Dacron swabs, cotton swabs, or cytobrushes was 0.91 and the Q* estimate was 0.85 (95% CI 0.79 to 0.91). In women recruited at referral centres with abnormal Pap smears, the AUC was 0.93 and the Q* estimate was 0.87 (95% CI 0.84 to 0.90).
This systematic review offers summary estimates of the diagnostic accuracy of self collected vaginal specimens using Dacron swabs, cotton swabs, or cytobrushes for HPV-DNA testing compared with clinician collected samples. When Dacron swabs, cotton swabs, or cytobrushes were used, the overall sensitivity of self collected specimens for HPV-DNA was 0.74 and the specificity was 0.88 compared with clinician obtained specimens using these same devices. Summary estimates increased to 0.81 and 0.90 for sensitivity and specificity, respectively, when self sampling was conducted in referral settings. This is likely because women with active cervical disease, as reflected by abnormal Pap tests requiring referral, are likely to have a higher burden of viral shedding, thus enabling easier detection of HPV-DNA with self collected specimens in this population. Tampons offered sensitivity between 0.67 and 0.94, but given that fewer than four studies were available, we were unable to combine them to generate summary findings. PCR and HC-II offered similar ranges of sensitivity and specificity. Future studies that examine both the acceptability and the diagnostic accuracy of tampons for specimen collection would enable researchers to generate summary estimates of diagnostic accuracy for tampons, and to determine whether tampons offer an advantage over swabs and cytobrushes for self testing for HPV-DNA.
Overall estimates were calculated both with and without one outlier study.24 This study had extreme values of both D and S, a specificity of 1.00, and no false positive samples in a study with over 1100 patients enrolled. Given its large sample size, this study was strongly influential on the regression and SROC. Addition of this outlier study increased the sensitivity of Dacron swabs, cotton swabs, and cytobrushes from self sampling for HPV-DNA from 0.74 to 0.78. Given its strong influence on findings, the SROC and findings reported with the inclusion of this outlier study should be interpreted with caution.
Self collection of vaginal specimens for HPV DNA testing has been proposed as one solution to address cervical cancer screening in resource limited settings in less developed countries.21 It has been shown in cost effectiveness modelling in a low resource setting in the Republic of South Africa that once in a lifetime screening with any one of several imperfect methods (cervical cytology, HPV-DNA testing, visual inspection with acetic acid) would have a substantial impact on cervical cancer mortality.40 This systematic review can contribute to further cost effectiveness modelling for low resource settings with this summary estimate of sensitivity and specificity of self sampling for HPV-DNA.
Although HPV-DNA testing is recommended as an adjunct to Pap smear testing and not as a sole screening method for cervical cancer, self collection of HPV-DNA can still play a part in screening programmes for cervical cancer in developed countries. With many women reluctant or even unwilling to undergo a pelvic examination because of cultural or personal concerns,14,15,16 a self collected sample may be offered to these patients. Assuming that women are fully informed of the limitations of a self collected sample compared to a specimen obtained by a clinician, if the alternative for these patients is to not have a Pap test, then self sampling offers a compromise where some information about cervical cancer risks can be determined. With the summary findings determined in this analysis, the increased uptake that may occur with self-collected swabs41,42 can be considered and quantified versus the loss of diagnostic accuracy as a result of using a self collected method for the population of women unwilling to undergo a pelvic examination.
Self sampling also offers an attractive option to facilitate the collection of specimens for HPV-DNA in women in geographically isolated areas in developed country settings such as Canada or Australia.43 Women who live in settings where they do not have access to regular healthcare providers could conduct repeated self testing over years, similar to Pap screening programmes, in order to identify their own risk for cervical dysplasia. Women identified as at risk could in turn be brought down to regional centres for further evaluation and, potentially, treatment. Clinical decision models evaluating the use of repeated self testing could be conducted with the findings of this review, to determine the risks and advantages offered by annual self testing for HPV-DNA compared to limited access to Pap testing that currently occurs in remote regions of developed countries.
This meta-analysis was conducted using widely accepted statistical methods for combining studies evaluating the diagnostic accuracy of tests and the summary estimates provided in this study were based on over 2500 samples. However, three studies which would have potentially offered important data37–39 were not included, as the authors were unable to provide raw data. In addition, although the studies had international settings, this meta-analysis only included studies that were published in English. Potentially, there were additional studies that would have enhanced the estimates provided.
This meta-analysis has provided a summary estimate of sensitivity and specificity for self collected specimens compared with clinician collected specimens for HPV-DNA. These findings are expected to enable clinicians and researchers to assess the benefits of self collection of vaginal specimens in terms of the potential increased acceptability and uptake of HPV-DNA testing compared to the loss of diagnostic accuracy inherent in this mode of specimen collection.