A comparison of respondent-driven and venue-based sampling of female sex workers in Liuzhou, China
- Sharon S Weir1,
- M Giovanna Merli2,
- Jing Li3,
- Anisha D Gandhi1,
- William W Neely4,
- Jessie K Edwards1,
- Chirayath M Suchindran5,
- Gail E Henderson6,
- Xiang-Sheng Chen3
- 1The Carolina Population Center and the Department of Epidemiology in the Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- 2Sanford School of Public Policy, Duke Population Research Institute, and Duke Global Health Institute, Duke University, Durham, North Carolina, USA
- 3The National Center for STD Control and the Chinese Academy of Medical Sciences, Institute of Dermatology, Nanjing, Jiangsu, China
- 4Lake Forest Park, Washington, USA
- 5The Carolina Population Center and the Department of Biostatistics in the Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- 6Department of Social Medicine, The School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Correspondence to Dr Sharon S Weir, The Carolina Population Center and the Department of Epidemiology in the Gillings School of Global Public Health, Campus Box 8120, University of North Carolina at Chapel Hill, Chapel Hill, NC 27546, USA.
UNAIDS Report 2012 Guest Editors
Peter D Ghys
Geoff P Garnett
- Accepted 27 August 2012
Objectives To compare two methods for sampling female sex workers (FSWs) for bio-behavioural surveillance. We compared the populations of sex workers recruited by the venue-based Priorities for Local AIDS Control Efforts (PLACE) method and a concurrently implemented network-based sampling method, respondent-driven sampling (RDS), in Liuzhou, China.
Methods For the PLACE protocol, all female workers at a stratified random sample of venues identified as places where people meet new sexual partners were interviewed and tested for syphilis. Female workers who reported sex work in the past 4 weeks were categorised as FSWs. RDS used peer recruitment and chain referral to obtain a sample of FSWs. Data were collected between October 2009 and January 2010. We compared the socio-demographic characteristics and the percentage with a positive syphilis test of FSWs recruited by PLACE and RDS.
Results The prevalence of a positive syphilis test was 24% among FSWs recruited by PLACE and 8.5% among those recruited by RDS and tested (prevalence ratio 3.3; 95% CI 1.5 to 7.2). Socio-demographic characteristics (age, residence and monthly income) also varied by sampling method. PLACE recruited fewer FSWs than RDS (161 vs 583), was more labour-intensive and had difficulty gaining access to some venues. RDS was more likely to recruit from areas near the RDS office and from large low prevalence entertainment venues.
Conclusions Surveillance protocols using different sampling methods can obtain different estimates of prevalence and population characteristics. Venue-based and network-based methods each have strengths and limitations reflecting differences in design and assumptions. We recommend that more research be conducted on measuring bias in bio-behavioural surveillance.
In recognition of the importance of HIV/sexually transmitted infection epidemics among sex workers, WHO recommends HIV and syphilis surveillance of sex workers.1 A common surveillance strategy is trend analysis of periodic cross-sectional bio-behavioural surveys.2 Methodological challenges arising from the hidden culture and illegal status of sex work make results difficult to interpret. These challenges include how to identify sex workers, how to sample the population and how to adjust crude estimates to account for differences in the probability of recruitment.
In this study, we compare two different methods to sample sex workers for the purpose of obtaining unbiased estimates of the characteristics of the population in a defined geographic area: a venue-based method named the Priorities for Local AIDS Control Efforts (PLACE) and a network-based method known as respondent-driven sampling (RDS). The methods were independently and concurrently implemented in Liuzhou prefecture (population 3.6 million) in Guangxi Province, China. The two strategies vary in assumptions, recruitment and analytic methods.
PLACE identifies and maps venues where people meet new sexual partners, selects a probability sample of venues and recruits participants from sampled venues. Data are analysed using statistical methods designed for complex surveys.3–6 The disadvantages of venue-based methods include the additional fieldwork required for mapping and visiting venues and the potential bias arising from missing non-venue-based sex workers. PLACE differs from other time–space sampling protocols7 used in surveillance of key populations in that venues are visited at peak times rather than randomly selected times; venue eligibility extends to any venue where people meet sexual partners, rather than only those with sex workers (or other target populations); and stratified sampling is often used to obtain estimates for more than one subgroup of interest. Target groups for surveillance, such as female sex workers (FSWs), are identified as a subgroup during analysis of PLACE data based on responses to survey questions (eg, did you exchange sex for money in the past 4 weeks?). The comparison with RDS presented here is based on the PLACE subsample of female venue workers who met the study definition of a sex worker.
RDS is a chain referral sampling method in which purposively selected ‘seeds’ initiate recruitment and invite others from their peer network for an interview conducted in a location selected for privacy, acceptability and convenience to participants.8–10 Chains of recruitment are documented, and recruitment continues until stopped by investigators. For sex worker studies, RDS assumes that sex workers are able to tell how many women they know who meet the eligibility criteria of the survey and to whom they would potentially be able to pass a coupon; that participants will recruit sex workers randomly from their network alters; and that the network consists of one connected component. Estimates account for different probabilities of selection arising from differences in the reported number of sex workers known by respondents. RDS has been used widely11,12 in studies of injecting drug users, sex workers and men who have sex with men.13–15 The disadvantages of RDS include the potential bias arising from non-random recruitment of network alters, from impersonation of eligible respondents in order to participate, from inability to accurately report the number of eligible respondents known and from failure to include eligible respondents who are not socially networked. Recent computer simulations of RDS suggest that the variance in RDS estimates may be greater than previously expected.16
Study population, outcome measures and power calculation
Our primary comparison measure is the prevalence of a positive rapid test for syphilis among FSWs, defined as self-identified female subjects aged 15 or older living in Liuzhou prefecture who report exchanging sex for money in the previous 4 weeks. Assuming a design effect of two, we estimated that a sample size of 380 in each arm would have 80% power to detect an absolute difference of 5% in the estimated prevalence. We defined the geographic boundary of the study to include all of Liuzhou prefecture, comprising the four urban districts of Liuzhou City and the six Liuzhou counties. The decision was based on formative research in Liuzhou that found that recruitment chains initiated in Liuzhou City would extend to Liuzhou counties and that precluding recruitment from Liuzhou counties would cut off recruitment chains prematurely in the RDS arm.
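The power statement above can be illustrated with a minimal sketch of a standard two-proportion sample size formula inflated by a design effect. This is not the calculation used in the study: the baseline prevalence assumed for the original calculation is not reported here, so the function takes both prevalences as inputs, and the function name `n_per_arm`, the two-sided 5% significance level and the normal approximation are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def n_per_arm(p1: float, p2: float, alpha: float = 0.05,
              power: float = 0.80, deff: float = 2.0) -> int:
    """Approximate sample size per arm to detect p1 vs p2.

    Uses the normal-approximation formula for comparing two independent
    proportions, then multiplies by the design effect (deff).
    All default values are illustrative assumptions.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n_srs = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    # Inflate the simple-random-sample size by the design effect.
    return math.ceil(deff * n_srs)
```

Under this approximation the required sample size depends strongly on the assumed baseline prevalence, so a fixed absolute difference of 5% corresponds to different sample sizes at different prevalence levels.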
The rapid test (Wantai anti-TP Antibody Rapid Test, Wantai Biological Pharmacy Enterprise, Beijing, China) measures the antibody response to a treponemal antigen using whole blood obtained from a finger prick and provides a result within 30 min. Participants testing positive were offered free supplemental testing with a non-treponemal test (TRUST, Rongsheng Biotechnical Company, Shanghai, China) and free treatment if indicated. RDS participants could initiate supplemental testing immediately in the RDS interview office; PLACE participants were referred to a clinic. The rapid test is convenient in outreach settings, holds value for participants and provides a biomarker comparison measure not subject to respondent reporting bias. The disadvantage of this test for surveillance purposes is that the treponemal antibody is a lifetime marker of infection and consequently does not distinguish between current and past infection.
A sampling frame of venues was compiled based on brief anonymous interviews with 400 community informants aged 18 and older. Community informants were asked to identify venues where people meet new sex partners, including but not limited to venues with sex work. Using strata defined by geographic area, type of venue and the number of informants reporting it, a stratified random sample of venues was identified, visited and characterised based on a face-to-face interview with a knowledgeable person onsite. We used these venue-level data to construct three strata of venues from which to sample workers for the PLACE-RDS comparison: Stratum 1: Venues in urban districts where at least five female workers were sex workers and/or 50% of female venue workers were sex workers; Stratum 2: Other venues in urban districts; and Stratum 3: County venues. In order to reach our target of 380 sex workers, we oversampled venues from Stratum 1. All female workers at selected venues in Stratum 1 and Stratum 2 were eligible. In county venues, a maximum of five female workers were randomly selected and interviewed. (Female venue patrons were not eligible for the study because venue-level data and a pilot patron study found very few sex workers among female patrons. See also the online supplementary material.)
Established RDS methods8,9 were used to identify and recruit sex workers. The RDS protocol in Liuzhou was mainly based on an RDS protocol used in Shanghai among FSWs. Meetings were held in Liuzhou with people working with sex workers to adapt the Shanghai protocol to Liuzhou. Waves of recruitment were initiated by seven seeds selected by the study team from diverse sex work settings. RDS participants were first screened with questions to confirm their eligibility and prevent impersonation. They were then interviewed, tested for syphilis and instructed how to recruit up to two other eligible FSWs using uniquely coded coupons. Participants could drop in to the RDS office, which was centrally located in an urban district, or call for an appointment. Interviewers collected limited biometric data to prevent repeat participation. After the 15th wave, participants were limited to one coupon17 to restrict sample growth. Network size was assessed using the question: ‘How many sex workers do you know in Liuzhou (including Liuzhou counties)? By knowing, I mean: you know their names and they know yours, and you have met or contacted them in the past month.’ Interviewers were trained to explain the eligibility criteria to participants, to check the eligibility of each participant and to identify impersonators. When participants returned for their payments, they were asked about the characteristics of those who refused a coupon.
Respondents provided verbal informed consent prior to participation. Surveys were administered face-to-face by trained interviewers in Mandarin Chinese or Zhuang. In the PLACE arm, study staff located private settings within venues for the interview and no unique identifiers were obtained. PLACE participants received ¥100 (about $14). An RDS participant received ¥100 initially and ¥50 for each recruit. The amounts of the incentives were determined locally. The Research Ethics Committee of the National Center for STD Control, China and the Institutional Review Boards at the University of North Carolina and Duke University approved the protocol.
In the PLACE arm, the sampling weight for each worker was the inverse of the probability of selection into the sample, taking into account the probability that the venue was selected and willing to participate and the probability that the respondent was selected. In the RDS arm, estimates were obtained using both the RDS Analysis Tool (RDSAT) V.6.0.118 and an RDS-II estimator.12 Only RDS-II estimates are presented here. (See online supplementary material for a description of the RDS-II methods including the bootstrap estimator for CIs and for comparison of RDS-II and RDSAT estimates.12,19) To compare the characteristics of sex workers recruited by each method, the Cochran–Mantel–Haenszel statistic was used in Proc Freq in SAS, using weighted frequencies and ignoring the design effect. To compare the proportion with a positive test, PLACE estimates and CIs were estimated using Proc Surveyfreq in SAS20 to account for clustering, the design effect and probability of selection. RDS estimates and CIs used RDS-II bootstrapping methods, further described in the online supplementary material for this report. For the multivariate analysis, we combined PLACE and RDS datasets and used binomial regression with generalised estimating equations to account for clustering by venue strata (PLACE) and seeds (RDS).
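The two weighting schemes described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names are hypothetical, the PLACE weight is simplified to the product of two selection probabilities, and the RDS-II point estimate is written in its usual form, in which each respondent is weighted by the inverse of her reported network size. The bootstrap CIs and the survey-design variance estimation used in the study are not reproduced.

```python
from typing import Sequence


def place_weight(p_venue: float, p_worker: float) -> float:
    """Inverse-probability sampling weight for one PLACE respondent.

    p_venue:  probability the venue was selected and participated (assumed input)
    p_worker: probability the worker was selected within the venue (assumed input)
    """
    if not (0 < p_venue <= 1 and 0 < p_worker <= 1):
        raise ValueError("probabilities must be in (0, 1]")
    return 1.0 / (p_venue * p_worker)


def weighted_prevalence(outcomes: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted proportion with outcome == 1 (for example, a positive test)."""
    if len(outcomes) != len(weights) or not outcomes:
        raise ValueError("outcomes and weights must be non-empty and equal length")
    total = sum(weights)
    return sum(w * y for y, w in zip(outcomes, weights)) / total


def rds_ii_prevalence(outcomes: Sequence[int], degrees: Sequence[int]) -> float:
    """RDS-II point estimate: weight each respondent by 1 / reported network size."""
    if any(d <= 0 for d in degrees):
        raise ValueError("reported network sizes must be positive")
    return weighted_prevalence(outcomes, [1.0 / d for d in degrees])
```

In the RDS-II form, respondents who report large networks are down-weighted because they are assumed to be more likely to be recruited, which is why inaccurate reports of network size can bias the estimate.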
PLACE arm: worker and sex worker samples
Community informants identified 971 venues, in urban districts (67%) and in the counties (33%). Over half (53%) of the venues were named by two or more informants. The most common types of venues were massage parlours (24%), hair salons (12%) and karaoke clubs (11%), but parks, hotels, outdoor markets and streets were also named. Of the 971 venues named, 385 were selected for a venue visit and 64 venues were ultimately selected for worker interviews, including all 16 venues reporting significant sex work (Stratum 1), 14 randomly selected venues in urban districts (from Stratum 2) and 23 randomly selected county venues (Stratum 3). Of the 64 venues selected, eight were not in operation when worker interviews were conducted (2–28 January 2010) and 11 venues refused to participate. Interviewers counted 806 female workers at the 45 participating venues and interviewed 680 female workers. Of the 126 female workers not interviewed, 36 worked at county venues where the five-worker limit had been reached and 58 worked at a large venue in an urban district and left before their interviews could be initiated. There were no direct refusals by female workers and no information was collected about the workers who were counted but not interviewed.
One-fourth of the female workers reported ever receiving cash or gifts in exchange for sex and 18.2% of the female workers (n=161) had done so in the last 4 weeks, thereby meeting the study definition for FSW. FSWs had a lower age at first sex, lower education, more arrests and more sexual partners than other female workers.
RDS arm: sex worker sample
RDS recruited 583 FSWs in Liuzhou between 26 October 2009 and 29 January 2010. Six of the seven seeds recruited additional participants, generating 9–20 recruitment waves. A total of 310 recruiting respondents recruited a mean of 1.9 participants, while most of the remaining 273 respondents were terminal nodes of the recruitment tree. Of those network alters approached, 29% did not accept the invitation to participate, mainly because of fear of being identified as a sex worker (75%). Of the 583 RDS participants, 47 (8.1%) refused syphilis testing, mostly (45/47, 95.7%) because they were recently tested.
Comparison of socio-demographic and behavioural characteristics
Socio-demographic characteristics of sex workers varied according to sampling method (table 1).
RDS estimated that 46% of FSWs lived in the district where they were interviewed, whereas PLACE estimated that half lived in the counties and only 9% lived in the urban district where the RDS office was located (figure 1A).
Compared with PLACE, RDS estimated that more FSWs were separated, divorced or widowed (24.0% vs 6.7%); that they had a higher mean monthly income (4888 renminbi vs 1994); were less likely to have solicited in counties (3.9% vs 61.4%) or outside Liuzhou (12.2% vs 31.1%); were more likely to have solicited by telephone or internet (31.6% vs 5.7%); and less likely to have been previously tested for syphilis (7.6% vs 35.2%) or HIV (28.9% vs 46.5%) in the past year. Characteristics that did not vary by sampling method included education and having 10 or more partners in the past 4 weeks (table 1).
Comparison of syphilis test results
PLACE estimated that almost three times as many FSWs had a positive syphilis test as RDS: 24% (95% CI 13.2 to 34.8) versus 8.0% (95% CI 5.9 to 13.0) (figure 1B). If those with missing test result data are excluded, the RDS estimate is 8.5%. Among FSWs younger than 25, the PLACE estimate was an order of magnitude higher (23.9% vs 2.8%). The prevalence among RDS FSWs aged 15–24 (2.8%) was similar to the percentage of all female workers aged 15–24 at PLACE (6.3%), most of whom did not report sex work (data not shown). The 20 FSWs sampled by both RDS and PLACE methods were less likely to have a positive test than those reached by PLACE alone (1.8% vs 27.5%). Soliciting in Liuzhou counties and outdoors was associated with a positive syphilis test in both samples (table 2).
The estimated unadjusted prevalence difference of a positive rapid test comparing PLACE with RDS was 16.8%; the prevalence ratio was 3.3 (95% CI 1.5 to 7.2). After controlling for age and urban district/county residence, the prevalence ratio was 2.2 (1.2 to 3.9) (table 3).
Concurrent surveys of sex workers in Liuzhou, China, using different sampling methods found significantly different estimates of the prevalence of a biomarker of syphilis (24.0% vs 8.5%) and other characteristics. This is the first study to compare biomarker outcomes from concurrently implemented venue-based and RDS investigations of sex workers. Previous studies have compared findings from different time periods or from samples not designed to compare estimates.21–24
We expected the two protocols to obtain similar estimates. Without a gold standard measure, interpretation is difficult, although some insight is available from comparison with a 2005 study25 and exploratory analysis of possible explanations of the difference. The 2005 study of sex workers in urban districts of Liuzhou estimated the prevalence of syphilis to be 11% using a two-stage testing algorithm consisting of a rapid plasma reagin (RPR) test followed by a Passive Particle Agglutination Test for those with a positive RPR test. We do not know what percentage of sex workers in the Lu study25 would have tested positive using the antibody rapid test used in our study, but it would have been higher than 11% as the rapid test would be positive for people previously infected as well as the 11% currently infected. The percentage of sex workers in Liuzhou City with a positive rapid test was higher than 11% for those recruited by PLACE (17.8%) and lower among those recruited by RDS (8.2%), although the difference is not statistically significant. Comparison with the 2005 study results is complicated by the time lag between studies, the different sampling strategies and the different socio-demographic profile of the 2005 study population. A more informative analysis would compare syphilis test results for young sex workers or sex workers working in similar venues.
PLACE could have overestimated prevalence if uninfected sex workers were missed because they were not venue-based, they denied sex work, they worked at refusing venues or if PLACE oversampled older women more likely to have a biomarker for lifetime exposure. We explored these possibilities. Given that only 7% of RDS FSWs reported exclusively recruiting non-venue-based clients in the past 6 months (data not shown), it seems unlikely that PLACE missed a large proportion of non-venue-based FSWs. We assumed that infection among workers at refusing venues was similar to prevalence at participating venues, but if not, PLACE could overestimate prevalence. Age composition is a less likely explanation as FSWs recruited by PLACE were younger than FSWs recruited by RDS.
Venue closings and refusal may have had a substantial effect on PLACE estimates. Significantly fewer FSWs were recruited by the PLACE protocol than the 380 FSWs expected. Obtaining a sufficient size sample is generally not a problem for venue-based sex worker surveys because protocols typically identify replacement venues to ensure targets are met. We did not anticipate that 19 of 64 venues would refuse participation or close prior to Spring Festival. Recruiting from additional venues during Spring Festival was not feasible.
The stigma and illegal status of sex work may have led to the denial of sex work and a reduction in the number of FSWs identified by PLACE. Because we interviewed and tested female workers at PLACE venues regardless of whether they reported sex work or not, we can estimate the percentage with a positive test among subgroups of workers most likely to include women actually engaged in sex work who deny it. The percentage of all female workers with a positive test (including women who reported one or no sexual partners in the past year) was 6.8%, a percentage not significantly different from the percentage among FSWs recruited by RDS (8.0%). If all PLACE workers who reported more than one sexual partner in the past year are assumed to be sex workers (an extreme and untenable assumption), the point estimate of PLACE FSW with a positive rapid test would still be twice as high as the RDS estimate (17.8% vs 8.0%). If all female workers at hair salons and massage parlours (two types of venues that are often fronts for commercial sex in China) are assumed to be sex workers, the point estimate would decrease from 24% to 14.9%.
Several scenarios could result in the RDS estimate being too low. Underestimates could arise if: (1) infected subgroups were not linked through the peer network; (2) the interview location was less accessible to those infected; (3) participants who refused testing were more likely to be infected than those who agreed to testing; or (4) a large number of respondents did not meet the eligibility criteria (eg, because screening methods were not effective, definitions were not clear or the incentives attracted people who were not eligible). Underestimates could also arise if sampling weights were biased due to inaccurate reports of network size, preferential recruitment of uninfected persons and/or larger networks among infected individuals (leading to down-weighting of infected individuals). We explored these possibilities in a limited way.
RDS recruited few sex workers from Liuzhou counties. The finding that RDS missed geographic pockets of sex workers has been previously reported,24 and in hindsight, establishing RDS offices in the counties may have increased participation from the counties, albeit at the risk of significantly increasing costs and introducing the complication that different recruitment sites may not recruit from the same network. Travel time to the RDS office could exceed 3 h and there were few cross-cutting ties evident in the recruitment chains between urban district and county respondents. If the comparison were limited to sex workers in urban districts, the prevalence ratio would drop from 3.3 to 2.2 (17.8% vs 8.2%).
It is possible that some other subgroups of FSWs with higher prevalence of infection were missed by RDS. Only two of six recruitment chains had more than two infections (see figure 2). Three chains with two or fewer infections primarily recruited from karaoke bars or karaoke TV. Because the RDS assumption of non-preferential recruitment constrains study managers from guiding the referral process toward members of the population who are likely to be missed, it is possible for recruitment chains to become trapped in low or high prevalence networks. There is also some indication that the RDS prevalence estimate would have been higher if all RDS recruits had agreed to be tested. For example, nine of the 47 (19%) who refused testing volunteered that they had previously tested positive.
Another possible explanation for the difference is that the PLACE sample captured women who were more frequently engaged in sex work whereas the RDS sample recruited people who were at lower risk because they less frequently engaged in sex work. It is difficult to fully assess the risk profiles of each group without information on the level of infection among clients, but there was no difference in the number of partners reported by PLACE versus RDS participants.
This study illustrates the challenges of surveillance among hidden populations. Two different sampling methods resulted in significantly different characterisations of the same target population. We focused on syphilis, but the findings are relevant to other sexually transmitted infections and relevant sexual risk behaviours as well. Our study confirms that countries should exert caution in selecting or changing surveillance methods2 and illustrates the shift in estimated prevalence that can arise with a change in sampling methods.
Concurrent implementation afforded insights into each method. We recommend that surveillance activities routinely include investigation of bias. For venue-based methods, the proportion of non-venue-based sex workers should be estimated. The characteristics of venues that refuse and of substituted venues, and the reasons for refusal, should be analysed to assess sample representativeness. Venue-based studies may also want to assess bias arising from denial of sex work, possibly through a longer survey or an in-depth interview of a subset of workers who initially deny sex work. Although obtaining information on the 519 female workers who were not sex workers allowed useful exploration of survey bias and important information on another group at risk of infection, the PLACE method was not as efficient in obtaining a large sample of sex workers as other venue-based methods that screen out non-sex workers from the survey. RDS studies would also benefit from routine investigation of key assumptions. Insight on recruitment bias and its impact on the RDS estimates can be gained by obtaining information on the characteristics of people in a participant's network, including network alters who were not invited to participate.26 Insight on these types of bias can also be gained from mapping the work location of those recruited to identify whether any geographic pockets of the population are missed.
Concurrently implemented surveillance protocols using different sampling methods can obtain different estimates of prevalence and population characteristics.
Venue-based and network-based methods each have strengths and limitations reflecting differences in design and assumptions.
We recommend that more research be conducted on measuring bias in bio-behavioural surveillance estimates.
We thank JL for her excellent work as study coordinator, and the physicians and the outreach workers in the study area for their hard work. We acknowledge the contribution of Dr Myron Cohen. We also acknowledge Ms Feng Tian of Duke University and Ms Katharine L McFadden of the University of North Carolina for their assistance.
Contributors SSW: Overall PI for the study, responsible for implementation of the venue-based arm, and primary writer of the manuscript. GM: Co-PI for the study with primary responsibility for the RDS arm, contributed to text of paper, and overall analysis. X-SC: Co-PI responsible for oversight of syphilis testing, field work, and implementation, contributed to interpretation of results and analysis. ADG: Responsible for RDSAT analysis and review of paper. WWN: Responsible for RDS-II CIs and review of paper. JKE: Responsible for multivariable analysis and review of paper. CMS: Responsible for overall technical oversight of all statistical issues and review of paper. JL: Responsible for interviewer training, day to day coordination of field work, data quality, review of manuscript, resolving data issues and data entry. GEH: Responsible for identification of study location, facilitating collaboration with people in China and review of manuscript.
Funding Funding for the China PLACE-RDS Comparison Study was provided by USAID under the terms of cooperative agreements GPO-A-00-03-00003-00 and GPO-A-00-09-00003-0; by NICHD through the UNC R24 ‘Partnership for Social Science Research on HIV/AIDS in China’ (R24 HD056670-01) and the Pre-Doctoral Training Programme at the Carolina Population Center (R24 HD50924), University of North Carolina; by UNICEF/UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases (TDR) through the ‘WHO Rapid Syphilis Test Project (A70577)’; by the Duke University and University of North Carolina Center(s) for AIDS Research; and by the National Center for STD Control in China.
Competing interest The authors have no conflict of interest to declare. The opinions expressed are those of the authors and do not necessarily reflect the views of any government.
Ethics approval Ethics approval provided by the IRB at UNC-Chapel Hill, IRB at National Center for STD Control in China.
Provenance and peer review Commissioned; externally peer reviewed.
Open Access This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/