Objective: Nucleic acid amplification tests have facilitated field based STD studies and increased screening activities. However, even with highly specific tests, the positive predictive value (PPV) of such tests may be lower than desirable in low prevalence populations. We estimated PPVs for a single LCR test in a population survey in which positive specimens were retested.
Methods: The Baltimore STD and Behavior Survey (BSBS) was a population based behavioural survey of adults which included collecting urine specimens to assess the prevalence of gonorrhoea and chlamydial infection. Gonorrhoea and chlamydial infection were diagnosed by ligase chain reaction (LCR). Nearly all positive results were retested by LCR. Because of cost considerations, negative results were not confirmed. Predicted curves for the PPV were calculated for a single testing assuming an LCR test sensitivity of 95%, and test specificities in the range 95.0%–99.9%, for disease prevalences between 1% and 10%. Positive specimens were retested to derive empirical estimates of the PPV of a positive result on a single LCR test.
Results: 579 participants aged 18–35 provided urine specimens. 20 (3.5%) subjects initially tested positive for chlamydial infection, and 39 (6.7%) tested positive for gonococcal infection. If positive results on the repeat LCR are taken as confirmation of a “true” infection, the observed PPV for the first LCR testing was 89.5% for chlamydial infection and 83.3% for gonorrhoea. This is within the range of theoretical PPVs calculated from the assumed sensitivities and specificities of the LCR assays.
Conclusions: Empirical performance of a single LCR testing approximated the theoretically predicted PPV in this field study. This result demonstrates the need to take account of the lower PPVs obtained when such tests are used in field studies or clinical screening of low prevalence populations. Repeat testing of specimens, preferably with a different assay (for example, polymerase chain reaction), and disclosure of the non-trivial potential for false positive test results would seem appropriate in all such studies.
- ligase chain reaction
- population surveys
Chlamydial infection and gonorrhoea are the two most common bacterial sexually transmitted infections in the United States,1 predominantly affecting adolescents and young adults. Medical and public health interventions require accurate diagnosis, which is easily accomplished for symptomatic patients. Most gonococcal and chlamydial infections in women, however, are asymptomatic. In male clinic populations, the literature suggests that up to 10% of gonococcal urethritis,2,3 and a third of chlamydial urethritis may be asymptomatic.4 In population based surveys, the proportion of asymptomatic disease is higher still. Screening asymptomatic people at risk for infection has therefore evolved as a major public health control strategy.
New non-invasive nucleic acid amplification tests (NAATs), such as ligase chain reaction (LCR), have been licensed for diagnosis of gonorrhoea and chlamydial infection in urine for both males and females.5–8 Testing of urine specimens eliminates the need for clinical examination, and it provides the opportunity for STD screening in community based sites, such as schools,9 military field clinics,10 developing country settings,11 and primary care clinics. Non-invasive screening can also be used in population surveys, such as the ones recently performed in Baltimore,12 in the National Survey of Adolescent Males,13 in a pilot test of the National Health and Nutrition Examination Survey,14 and in a study of Job Corps participants.15
Use of NAAT results for STD diagnosis in population surveys or other screenings of low prevalence populations is associated with a lower positive predictive value (PPV) and higher proportion of false positive results than one encounters in clinical practice. This issue has been previously raised by Schachter and Chow.16
In this paper, we report on our experience conducting a population survey that employed LCR tests to diagnose gonococcal and chlamydial infections. We estimated PPVs based on a single LCR test and compared the theoretically predicted PPV for a single testing with the results we obtained when positive LCR tests were retested for confirmation. Discordance in test results across these two testings provides an empirical measure (albeit an imperfect one) of the predictive value of a positive result in a single LCR testing.
Baltimore STD and Behavior Survey
The Baltimore STD and Behavior Survey (BSBS) was a population based cross sectional household survey of adults residing in Baltimore, MD, USA. Survey interviews were conducted with 1014 respondents aged 18–45 years between January 1997 and September 1998. According to the study protocol, only respondents between 18 and 35 years of age were asked to provide a urine specimen for gonorrhoea and Chlamydia trachomatis testing. Of the 728 respondents aged 18–35, 579 (80%) provided a urine sample, 16% refused, and 4% were not tested (owing to interviewer error, inability to provide a specimen, insufficient volume, etc).
For study purposes, respondents were considered positive if both an initial and a confirmatory LCR were positive. Repeat tests were unavailable for three cases who tested positive initially. These cases were classified as positive based on their first testing. All respondents were given a telephone number they could call to learn of their test results. Study staff used a succession of methods to attempt to contact participants who were confirmed positive (telephone, registered letter, and, if refused or undelivered, regular mail). Free, expedited treatment at one of the Baltimore City Department of Health clinics was offered to all contacted subjects with confirmed positive results.
Details of the full study design have been described elsewhere.12 Protocols were approved by the institutional review board at Research Triangle Institute and the Johns Hopkins University.
First LCR testing
Urine testing was performed following standard LCR procedures detailed by the manufacturer (LCx, Abbott Laboratories, Abbott Park, IL, USA). For Neisseria gonorrhoeae, the optical density (OD) cut offs for the LCR products were: negative, <0.8; indeterminate, 0.8–1.2; positive, >1.2. For the C trachomatis assay, the cut offs were: negative, <0.8; indeterminate, 0.8–1.0; positive, >1.0. All tests in the indeterminate range were retested to provide a “first test” result.
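The cut offs above amount to a three way classification of each OD reading. The following is a minimal sketch of that rule, not the manufacturer's software; the function name `classify_od` and the dictionary `POSITIVE_CUTOFF` are hypothetical, and the treatment of readings exactly equal to a cut off (called indeterminate here) is an assumption, since the text does not specify it.

```python
def classify_od(od, positive_cutoff, negative_cutoff=0.8):
    """Classify an LCR optical density (OD) reading.

    Assumption: a reading exactly equal to either cut off is treated as
    indeterminate, because the text leaves the boundaries unspecified.
    """
    if od < negative_cutoff:
        return "negative"
    if od > positive_cutoff:
        return "positive"
    return "indeterminate"


# Positive cut offs quoted in the text for each assay.
POSITIVE_CUTOFF = {"gonorrhoea": 1.2, "chlamydia": 1.0}

# Illustrative readings only; these are not data from the study.
for assay, od in [("gonorrhoea", 1.1), ("chlamydia", 1.1), ("chlamydia", 0.5)]:
    print(assay, od, classify_od(od, POSITIVE_CUTOFF[assay]))
```

The same reading of 1.1 is indeterminate on the gonorrhoea assay but positive on the chlamydia assay, which is why the assay specific cut off is passed in explicitly.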
We retested 19 of 20 specimens that tested positive for chlamydial infection on their first LCR testing and 36 of 39 specimens that tested positive for gonococcal infection. Two specimens that initially tested positive for gonococcal infection and one specimen that initially tested positive for both gonococcal and chlamydial infections were not retested because the original specimen was not available. We defined a confirmed positive result as a specimen with positive results on both initial and repeat testing. Negative tests were not confirmed.
Theoretical expectations for positive predictive value
The positive predictive value, which reflects the post-test probability of disease for a positive test,17 is defined as:
PPV = probability (disease | test positive) = true positives/(true positives + false positives)
The positive predictive value can also be expressed in a form based on Bayes’s theorem17:
PPV = (sensitivity × prevalence)/((sensitivity × prevalence) + (1 − specificity) × (1 − prevalence))
This expression of PPV highlights the dependence of the PPV on prevalence and permits the prediction of PPV across a range of prevalences for a test with known sensitivity and specificity.
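As an illustration, the Bayes form of the PPV can be written as a short function. This is a sketch for checking the arithmetic rather than the code used in the study; the function name `ppv` and its argument names are our own, and all inputs are expressed as proportions rather than percentages.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes's theorem.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence                    # P(test+ and diseased)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(test+ and not diseased)
    return true_pos / (true_pos + false_pos)


# 95.0% sensitivity, 99.0% specificity, 2% prevalence.
print(f"{ppv(0.95, 0.99, 0.02):.1%}")  # prints 66.0%
```

With a perfectly specific test the false positive term vanishes and the function returns 1 regardless of prevalence, which is the limiting case the formula implies.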
Previous work from our group has reported sensitivities of urine LCR testing in our laboratory of 88.6% to 95.5%, and specificity of 99% for women.18,19 Given the dependence of PPV on the prevalence and specificity, plots were constructed of the predicted PPV across a range of plausible values for prevalence (0.5%–10%) and specificity (95.0%–99.9%). The specificity estimates represent a range that we would expect when the tests are used in a variety of field settings outside the research laboratory. We examined PPV at sensitivities of 95.0% and 88.6%, but the effect of sensitivity on PPV was small. These sensitivity estimates were used because they represent the lower bounds of LCR sensitivity for diagnosis from urine in earlier studies.
We compared the theoretically derived positive predictive value of a single LCR result with the empirically observed PPV, using a reference standard of “confirmed positive results” (that is, positive on both first and second testing).
Across a range of plausible prevalences and test characteristics, the theoretical expectation for PPV of a single LCR testing remained relatively low (fig 1). At 99.0% specificity and 95.0% sensitivity, the PPV was 66% if population prevalence was 2%, and it dropped to 32% if population prevalence was 0.5%. Even if specificity were increased to 99.9%, PPV at 2% prevalence was 95.1%, and at 0.5% prevalence it was 82.6%, a result that predicts nearly one in five positive test results will be false positive if true prevalence were as low as 0.5%. (Similar results are obtained if sensitivity is assumed to be 88.6% instead of 95%; see results in fig 1.)
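The pattern in these figures can be tabulated directly from the Bayes expression. The sketch below assumes a sensitivity of 95.0%; the grid of prevalences and specificities is an illustrative choice within the ranges stated in the text, not the exact set of points plotted in fig 1, and the names `ppv` and `table` are our own.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes's theorem (inputs as proportions)."""
    tp = sensitivity * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp)


SENSITIVITY = 0.95
prevalences = [0.005, 0.01, 0.02, 0.05, 0.10]   # illustrative grid points
specificities = [0.95, 0.99, 0.995, 0.999]      # illustrative grid points

# One row of predicted PPVs per specificity, one column per prevalence.
table = {
    spec: [ppv(SENSITIVITY, spec, prev) for prev in prevalences]
    for spec in specificities
}

print("spec   " + "".join(f"{p:>8.1%}" for p in prevalences))
for spec, row in table.items():
    print(f"{spec:<7.1%}" + "".join(f"{v:>8.1%}" for v in row))
```

Reading across any row shows the PPV rising with prevalence, and reading down any column shows how strongly a small gain in specificity matters when prevalence is low.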
In our field study, there were 579 eligible participants. Twenty (3.5%) subjects initially tested positive for chlamydial infection and 39 (6.7%) tested positive for gonococcal infection. Retesting was performed on 19 of 20 positive chlamydia specimens and 36 of 39 positive gonorrhoea specimens. (No retests were available for two specimens that initially tested positive for gonorrhoea and one specimen that tested positive for both gonorrhoea and chlamydial infection.) Retests yielded a second positive result in 17/19 (89.5%) cases of chlamydial infection and 30/36 (83.3%) gonorrhoea cases. These empirical results indicate that the positive predictive values of the first LCR testing in our study were 89.5% for chlamydial infection and 83.3% for gonorrhoea.
The unweighted prevalence of confirmed infections was 17/576 (3.0%) for chlamydial infection and 30/576 (5.2%) for gonococcal infections. (The unweighted sample counts represent the results of our NAAT testing; they do not, however, provide valid estimates of the prevalence of infection in the population as a whole or in any subpopulation. Since we used a complex sample design which purposely oversampled certain segments of the population (see Turner et al12), only the weighted estimates can be used to make inferences about the prevalence of NAAT detectable infections in the population.) Assuming specificities for a single LCR testing to be 99.0%–99.6%, the theoretically expected PPV for a single LCR testing for chlamydia would be 73%–87%, and for gonorrhoea it would be 82%–92%. This theoretical expectation is in agreement with our empirical results.
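The comparison between expected and observed PPV can be sketched from the counts reported above. The text does not state which sensitivity underlies the quoted ranges; the sketch assumes the lower value of 88.6%, which gives figures close to (though, because of rounding, not necessarily identical to) those quoted. The names `ppv`, `counts`, and `results` are hypothetical.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes's theorem (inputs as proportions)."""
    tp = sensitivity * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp)


SENSITIVITY = 0.886                  # assumption: lower sensitivity bound from the text
SPECIFICITY_RANGE = (0.990, 0.996)   # specificity range assumed in the text

# infection: (confirmed positives, denominator as reported,
#             positive on retest, specimens retested)
counts = {
    "chlamydia": (17, 576, 17, 19),
    "gonorrhoea": (30, 576, 30, 36),
}

results = {}
for name, (confirmed, tested, retest_pos, retested) in counts.items():
    prevalence = confirmed / tested          # unweighted prevalence
    expected = tuple(ppv(SENSITIVITY, s, prevalence) for s in SPECIFICITY_RANGE)
    observed = retest_pos / retested         # empirical PPV of the first test
    results[name] = (expected, observed)
    print(f"{name}: expected {expected[0]:.1%} to {expected[1]:.1%}, "
          f"observed {observed:.1%}")
```

Because the prevalence fed into the formula is itself derived from the confirmed positives, this is a consistency check on the arithmetic rather than an independent validation.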
The availability of new, non-invasive NAATs (for example, LCR and polymerase chain reaction (PCR)) for STDs has resulted in recommendations to extend screening to venues of moderate and low prevalence such as emergency departments, primary care clinics, and others. Chlamydia testing is now recommended as part of routine care for sexually active women under 25 years of age. Using clinical tests to screen low prevalence populations presents new issues in the case of STD testing. A positive STD test result has implications for treatment, but it may also lead subjects to speculate about the source of their infection. This speculation could lead to social or psychological stress, especially if the individual is in a perceived monogamous relationship. Research and screening programmes that test low prevalence populations must both reduce the incidence of false positives and ameliorate their consequences.
Even under optimal conditions, unless the test specificity is 100%, clinicians will face the “epidemiological brick wall” described in figure 1 and table 1. These false positive results are predicted by Bayes’s theorem and are statistically unavoidable unless test specificity is 100%, which is seldom, if ever, attainable. Testing clinical specimens requires execution of a series of steps including specimen collection, transport, processing and detection, each of which is subject to error. In the BSBS survey, 36 field interviewers were employed to reach a population based sample of Baltimore City residents. Considering the complexity of the logistics, we were pleased with the diagnostic test’s performance on a single testing. The number of false positives that we observed in this study using repeat LCR testing was predictable based on test characteristics and the low prevalence of infection in the population.
Limitation of study
Our estimates of the PPV of LCR testing for gonococcal and chlamydial infection are limited by the absence of an independent “perfect” reference standard. We used as a reference standard a “confirmed positive result” based on initial and repeat LCR testing using the same specimen and the same test procedure. The use of the same test on the same specimen has the potential to misclassify some positive results, because a false positive test may be repeatedly false positive. Therefore, our empirical determination of the PPV for a single LCR testing might be viewed as the upper limit of the PPV for these tests because a small number of false positive results could go undetected with this testing scheme.
When clinical decisions are to be made based on test results from the screening of a low prevalence population, we believe that confirmatory algorithms are necessary. Repeat testing of positives will increase specificity and reduce the incidence of false positive test results. Ideally, a different, highly specific and sensitive test should be used as the confirmation. An initial testing with LCR might, for example, be confirmed by a repeat assay with PCR.
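The gain from a confirmatory algorithm can be illustrated under a strong simplifying assumption: that the errors of the two tests are independent. As noted in the limitations, a false positive may repeat, particularly when the same assay is rerun on the same specimen, so the sketch below should be read as an optimistic bound. The function names and the test characteristics assigned to the second assay are hypothetical.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes's theorem (inputs as proportions)."""
    tp = sensitivity * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp)


def confirm_positive(first, second):
    """Combined (sensitivity, specificity) when a specimen counts as positive
    only if both tests are positive.

    Assumption: the two tests err independently of each other.
    """
    sens1, spec1 = first
    sens2, spec2 = second
    combined_sens = sens1 * sens2                           # both must detect a true infection
    combined_spec = 1.0 - (1.0 - spec1) * (1.0 - spec2)     # both must err to give a false positive
    return combined_sens, combined_spec


first_test = (0.95, 0.99)    # initial LCR, values used elsewhere in the text
second_test = (0.95, 0.99)   # hypothetical confirmatory assay with the same characteristics

sens2, spec2 = confirm_positive(first_test, second_test)
for prevalence in (0.005, 0.02, 0.05):
    single = ppv(*first_test, prevalence)
    confirmed = ppv(sens2, spec2, prevalence)
    print(f"prevalence {prevalence:.1%}: single {single:.1%}, confirmed {confirmed:.1%}")
```

The price of the higher PPV is a lower combined sensitivity, since a true infection must now be detected twice, although in a scheme where only positives are retested this affects only specimens that were positive on the first test.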
Since no testing protocol will completely eliminate the threat of false positives, it will also be important to inform subjects with confirmed positive test results of the possibility of test error. This is a tricky task since we do not wish to discourage subjects from seeking treatment. In the BSBS, we eventually settled upon a script that:
- informed subjects that the tests are approved by the FDA and have been found to be very reliable, but that there was always the possibility for error with any test; and
- if the respondent had doubts about the test result, offered, at study expense, to collect another urine specimen and test it.
For all respondents except those who requested additional testing, we emphasised the importance of obtaining treatment and, as before, provided them with the address of the Baltimore public health clinic that would provide free, expedited treatment, if they did not have a private physician they wished to use.
The authors wish to thank Jeff Yuenger for his assistance in organising the laboratory testing and the survey operations staff of the Research Triangle Institute for their fielding of the survey.
Funding/support: Primary support for the Baltimore STD and Behavior Survey (BSBS) was provided by NIH grant R01-HD31067 to Dr Turner. Additional support was provided by the Research Triangle Institute and by grants R01-MH56318 to Dr Turner, K24-AI01633 and U19-AI38533 to Dr Zenilman. Dr Miller received support through the Clinical Associate Physician Program of the General Clinical Research Center (RR00046), Division of Research Resources, National Institutes of Health. Abbott Labs donated some of the LCR test kits used in this study.