How many men who have sex with men and female sex workers live in El Salvador? Using respondent-driven sampling and capture–recapture to estimate population sizes
  1. G Paz-Bailey1,2,3,
  2. J O Jacobson4,
  3. M E Guardado2,
  4. F M Hernandez2,
  5. A I Nieto5,
  6. M Estrada6,
  7. J Creswell7
  1. 1Del Valle University of Guatemala, Guatemala City, Guatemala
  2. 2Tephinet Inc., Atlanta, Georgia, USA
  3. 3University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  4. 4Pan American Health Organization, Andean Region, Bogota, Colombia
  5. 5Ministry of Health, San Salvador, El Salvador
  6. 6United States Agency for International Development, San Salvador, El Salvador
  7. 7World Health Organization, Geneva, Switzerland
  1. Correspondence to Dr Gabriela Paz-Bailey, Del Valle University of Guatemala, 18 avenida 11-42 zona 15, Vista Hermosa III, Guatemala City 01015, Guatemala; gpaz{at}


Objective To estimate the numbers of female sex workers (FSW) and men who have sex with men (MSM) in San Salvador, El Salvador.

Design and methods A capture–recapture exercise was conducted among MSM and FSW in San Salvador in 2008. The first capture was done by distributing key chains to both MSM and FSW populations through local non-governmental organizations (NGO) that work with these groups. The second capture was done during the course of an integrated behavioural and biological survey (IBBS) using respondent-driven sampling (RDS). The proportion receiving a key chain estimated from the IBBS study was adjusted by RDS-derived weights.

Results The first capture included 400 FSW and 400 MSM. Of the 624 MSM interviewed in the IBBS, 36 (5.8% crude; 3.2% RDS-adjusted) had received the key chain. The estimated population size of MSM in San Salvador was 12 480 (95% CI 7235 to 17 725). Of the 663 FSW interviewed in the IBBS, 39 (5.9% crude; 6.9% RDS-adjusted) had received the key chain. The estimated number of FSW was 5765 (95% CI 4253 to 7277).

Conclusions The capture–recapture exercise was successfully linked to an IBBS to obtain city-level population sizes for MSM and FSW, providing valuable information at a low cost. Size estimates are crucial for programme planning for national AIDS programmes, NGOs and stakeholders working with these populations and for HIV projection models.

  • El Salvador
  • developing world
  • epidemiology
  • HIV
  • homosexual
  • MSM
  • population size estimates
  • RDS
  • respondent driven sampling
  • sex workers
  • sexual behaviour


In central America, populations most at risk of HIV include men who have sex with men (MSM) and female sex workers (FSW). For MSM, HIV prevalence estimates range from 7.6% in Nicaragua to 15.3% in El Salvador, and among FSW from 0.2% in Nicaragua to 5.5% in Honduras.1 2

Despite the epidemiological evidence available, MSM appear to be a neglected group. In El Salvador, less than 3% of HIV prevention funds are dedicated to MSM3 and there are few HIV prevention interventions systematically implemented targeting this group. Estimation of the size of the populations most at risk of HIV infection is crucial for structuring the response to the epidemic. Such estimates form the foundation of evidence-based planning for resource allocation, service delivery, targeting of effective interventions and, potentially, monitoring impact.4 However, there have been few attempts in central America to estimate MSM or FSW population sizes.5

Many methods have been proposed to estimate population sizes, each with strengths and weaknesses.4 6 Capture–recapture methods are commonly used to count wildlife populations. The method begins with a capture stage, in which as many individuals as possible in a fixed geographical area are captured, tagged and released back into the population. A second sample of individuals is then captured independently, some of whom will have been marked previously. If the second sample is representative of the population as a whole, the proportion of marked individuals ‘recaptured’ in it estimates the proportion of the whole population that was captured in the first sample.7 We linked capture–recapture methods to an integrated behavioural and biological survey (IBBS) among MSM and FSW to estimate the size of both populations in San Salvador in 2008.


The study was conducted in the city of San Salvador, El Salvador, from May to September 2008. FSW were defined as women 18 years of age or older who reported exchanging sex for money in the past month and lived and/or worked in the city. MSM were defined as men 18 years of age or older, who had anal sex with another man in the past month and lived or worked in the city. For the capture–recapture, the first sample was obtained through the distribution of a set number of unique objects (key chains) to members of the target population. The unique objects were non-randomly distributed as widely as possible through outreach activities with peer educators and staff from five non-governmental organizations (NGO) during a period of 2 weeks in May 2008. Peer educators enquired about the inclusion criteria before distributing the unique object to the individual and verified they had not received a key chain from a different educator. Each individual from the target population contacted received exactly one object and was asked to keep it because he or she might be asked about it in the near future by another project staff member. Peer educators and other staff distributing the unique objects recorded the times and locations of distribution and the number of objects distributed, using standardised log sheets. These data comprise the first capture.

Locations for distribution of the unique objects were selected from a previous mapping exercise in El Salvador, which identified 150 public places where MSM gather and 495 locations where FSW gather or work. Because of constraints on study resources, a subset of 33 of the mapped locations was selected for inclusion in the first capture based on two principles: sites selected for the distribution of the unique object should have a large capacity and plenty of clients or visitors, and the selected places should include as many types of venue as possible (eg, parks, streets, bars, saunas, movie theatres). For MSM, 24 places were visited, including three shopping malls, two movie theatres, five parks, five bars, three public markets, four streets and two NGOs. For FSW, nine places were visited, including one beauty salon, one modelling agency, one bar, one gas station, three brothels and two parks.

The IBBS, Encuesta Centroamericana de Vigilancia de VIH y Comportamiento en Poblaciones Vulnerables (ECVC), was used as the second sample and is described in greater detail elsewhere.8 Briefly, the survey was designed to provide representative estimates of the MSM and FSW population for each city surveyed. ECVC used respondent-driven sampling (RDS) for the recruitment of participants. All persons recruited through RDS were screened to determine whether they were eligible to participate in the study. The questionnaire captured data about receiving the unique object in the weeks before the study. To minimise recall bias and migration into and out of the population, the unique objects were distributed shortly before RDS implementation. To estimate population proportions, participants were asked if they had received the unique object and to show or describe the object to the interviewer. Those who were in possession of the object or could recall it in sufficient detail were included in the estimate of overlap between the two captures. Calculations were then made to estimate the population size.

The number of unique objects to be distributed was agreed upon by the project coordinator and the organisations assisting in distribution. While increasing the number of unique objects may improve the precision of the final estimate, there is no set formula to determine a specific number for distribution. Factors related to the distribution of unique objects, such as logistics and human resources, were considered. The number of unique objects was determined by the number of contacts the distributing outreach organisations anticipated reaching within 1–2 weeks. It was estimated that 400 key chains for MSM and 400 for FSW could be distributed in 2 weeks in San Salvador. The sample sizes for the IBBS study were 600 FSW and 600 MSM, based on expected HIV prevalence and the ability to detect a change in a given behavioural indicator between the baseline survey and a follow-up study to be done in 4 years' time.8 The ethical review board of the Rosales National Hospital (RNH) in San Salvador reviewed and approved the study protocol. The study was also reviewed by the Centers for Disease Control and Prevention–Global AIDS Programme Associate Director for Science, who delegated approval to RNH.

Data analysis

The population size estimate was based on two factors: the number of individuals ‘captured’ in the first sample, which was fixed by the study; and the number of participants in the IBBS study who had received a unique object, a random variable. The estimation of the population size and variance was done using both the crude and RDS-adjusted number of individuals receiving the unique object. The final estimates and CI are based on the estimated adjusted numbers.

The capture–recapture formula for estimating the population size was obtained from Thompson:7

$$\hat{N} = n_1 \frac{1}{\hat{p}} = n_1 \left( \frac{n_2}{u} \right)$$

where $\hat{N}$ is the estimate of total population size, $n_1$ is the number of individuals in the initial sample, $n_2$ is the number of individuals in the second sample, $u$ is the number of individuals in the second sample who received the unique object and $\hat{p} = u/n_2$ is the estimated proportion of the ‘recapture’.

Using the following formula6 we estimated the variance of the population size estimate and 95% confidence bounds around our estimate:

$$\mathrm{Var}(\hat{N}) = \frac{n_1 n_2 (n_1 - u)(n_2 - u)}{u^3}$$

where $\mathrm{Var}(\hat{N})$ is the variance of the population size estimate.
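As a minimal sketch, the point estimate and variance formulas can be evaluated directly from the three counts. The function names are illustrative and not taken from the study, and the 1.96 multiplier assumes a normal-approximation 95% interval; the text does not state this explicitly, but it reproduces the reported bounds.

```python
import math


def population_estimate(n1, n2, u):
    """Point estimate: N-hat = n1 * (n2 / u)."""
    return n1 * n2 / u


def population_variance(n1, n2, u):
    """Var(N-hat) = n1 * n2 * (n1 - u) * (n2 - u) / u**3."""
    return n1 * n2 * (n1 - u) * (n2 - u) / u ** 3


def confidence_interval(n1, n2, u, z=1.96):
    """Normal-approximation interval (assumed): N-hat +/- z * sqrt(Var)."""
    n_hat = population_estimate(n1, n2, u)
    half_width = z * math.sqrt(population_variance(n1, n2, u))
    return n_hat - half_width, n_hat + half_width


# MSM, RDS-adjusted: 400 key chains, 624 IBBS participants, 20 recaptures
n_hat = population_estimate(400, 624, 20)
low, high = confidence_interval(400, 624, 20)
print(round(n_hat), round(low), round(high))  # 12480 7235 17725
```

Passing the crude recapture count instead (u = 36 for MSM) gives the unadjusted estimate, which shows how sensitive the result is to the recapture proportion.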


MSM size estimation in San Salvador

A total of 400 key chains was distributed among MSM. The IBBS had a total of 624 participants, 36 of whom reported having received the object. The unadjusted proportion of individuals receiving the item was 0.058 for an estimated 6933 MSM in the city of San Salvador (95% CI 4836 to 9031). The RDS-adjusted proportion was 0.032 (n=20) for an estimated 12 480 MSM (95% CI 7235 to 17 725).

FSW size estimation in San Salvador

For the FSW estimate, a total of 400 key chains was distributed, 663 individuals participated in the IBBS and of these 39 reported having received the object. The unadjusted proportion of individuals receiving the item was 0.059, which corresponded to an estimated 6800 FSW in the city of San Salvador (95% CI 4833 to 8767). The RDS-adjusted proportion that received the object was 0.069 (n=46) for an estimated 5765 (95% CI 4253 to 7277).


This study reports on a feasible method to estimate population sizes linking capture–recapture methods to RDS studies that provide representative estimates of the networked population. The difference between the crude and RDS-adjusted proportions in the population estimates for both MSM and FSW highlights the importance of having a representative study as the second data source. As IBBS become more common in countries with concentrated HIV epidemics and use more robust designs to capture representative data, countries should consider including population estimates as part of these activities. The low cost and ease of data collection is all the more reason to conduct these estimations. Due to the close cooperation with NGO working with MSM and FSW in San Salvador the incremental cost to the IBBS for the initial capture was US$500 for each population.

An international meta-analysis of population-based studies that enquired about male sexual behaviours reported a prevalence of lifetime sex between men in Latin America that varied from 2% to 25%. Sex between men in the past year varied from 1% to 8%.9 A similar study reported that the proportion of sex workers among the adult female population in urban centres in Latin America varied from 0.2% to 7%.10 As part of El Salvador's UNAIDS-supported HIV estimation exercise in 2009, which all countries develop biannually, the national estimation group estimated that MSM constitute 2–5% of the urban male population and FSW 0.4–0.8% of the urban female population 15–49 years of age, based on UNAIDS recommendations and available literature. These figures imply a total of 7267–18 167 MSM and 1627–3253 FSW in San Salvador. Our MSM estimate (12 480) falls within this range. For FSW, however, our estimate (5765) is almost double the upper bound of the UNAIDS-supported range, so our findings must be interpreted with caution. Based on our study and using the 2009 census urban adult population aged 15–49 years for San Salvador, 3.4% of men in San Salvador are MSM and 1.4% of women are FSW.11
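The population denominators behind these percentages are not given in the text, but they can be back-calculated from the UNAIDS-supported ranges. The sketch below does this as a rough consistency check; the denominators it derives are inferred rather than reported values, and the variable names are illustrative.

```python
# Back-calculate the urban population (ages 15-49) implied by the
# UNAIDS-supported ranges. These denominators are inferred, not reported.
implied_men = 7267 / 0.02      # lower MSM bound divided by 2%
implied_women = 1627 / 0.004   # lower FSW bound divided by 0.4%

msm_estimate = 12480  # RDS-adjusted capture-recapture estimate for MSM
fsw_estimate = 5765   # RDS-adjusted capture-recapture estimate for FSW

msm_share = 100 * msm_estimate / implied_men
fsw_share = 100 * fsw_estimate / implied_women

print(f"MSM: {msm_share:.1f}% of men")    # MSM: 3.4% of men
print(f"FSW: {fsw_share:.1f}% of women")  # FSW: 1.4% of women
```

Under these inferred denominators the shares agree with the 3.4% and 1.4% quoted from the census, which suggests that the UNAIDS-based ranges and the census figures rest on similar population counts.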

Several assumptions must be met in order to generate a reliable estimate using capture–recapture. The most important is that the samples should be independent. To ensure independence, different sampling strategies were used at each stage; furthermore, provided the RDS estimate is representative of the population, participation in the IBBS is independent of belonging to the more visible groups in which the first capture was conducted. The second assumption is that all members of the target population should have an equal chance of being selected, or at least that the second sample is representative of the population.6 7 The requirement of a simple random sample is unrealistic for research with hidden populations, such as MSM, for which it is challenging to construct a sampling frame. In our study, RDS was used to recruit participants in the second capture; weighted estimates generated from RDS data are considered to be unbiased estimates that are generalisable to the population as a whole. While RDS does not provide equal selection probabilities, the RDS-adjusted point estimates are representative of the population, and the adjusted variance is reflected in the CI. The last assumption is that the population is closed. The time between the two phases of this study was short and we believe that in- and out-migration were negligible.

We found that the crude and RDS-adjusted proportions of people receiving the key chain gave different results for MSM. It is likely that key chains were preferentially distributed to MSM with larger social networks; thus the recapture fraction is downweighted when applying the RDS adjustment and the estimated number of MSM increases in turn. Using the crude proportion would therefore underestimate the population by failing to correct for dependence between the first and second captures. This dependence occurs because members of the target populations who are more accessible for the key chain distribution would also be more likely to be recruited into the RDS study (because of their relatively large network sizes they have a greater possibility of being invited to participate). For FSW, the differences between the crude and adjusted estimates were not large. Notably, using the adjusted estimates for FSW resulted in a reduction in sampling error, in contrast to the increase in error seen for MSM. Interestingly, the homophily score (ie, tendency to recruit within subgroup) among FSW who received the key chain was lower (0.08) than the score among those who did not (0.30), while for MSM the relationship was the reverse (0.13 compared with −0.02, respectively). While not the focus of this work, one possible explanation is that distribution of the unique object to individuals who comprise less-connected rather than more-connected subgroups may improve sampling error; this should be investigated further.

It is believed that weighted estimates generated from RDS generalise to the population as a whole. However, recent assessments of the RDS sampling methodology have called into question the validity of this assertion and of conventional means of analysing RDS data, given the dependence of RDS on arguably unrealistic assumptions. Computer simulations using US network data suggest that variance may be underestimated, particularly in the presence of tightly linked subgroups that are poorly connected to the rest of the social network (ie, ‘bottlenecks’).12 In this study, for variables such as age, place of work, self-identification and educational level, we observed relatively large numbers of recruitments from each variable category to the remaining categories (ie, no zero cells in the recruitment matrix), suggesting that network bottlenecks were not a major concern with respect to these variables (data not shown). Evaluations of the RDS recruitment strategy also suggest that bias may be introduced if too few waves are included or if recruitment is preferential.13 In our study, of the 11 seeds in the RDS study, six resulted in long chains reaching at least five waves of recruitment. The longest recruitment chain included 18 waves. Finally, in RDS, estimates are based on a with-replacement random-walk model, while the actual sampling is without replacement. When a substantial fraction of the target population is sampled, this approximation can lead to bias in the resulting estimators.13 Based on our results, the RDS study sample represented roughly 5% (MSM) to 12% (FSW) of the estimated population.
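The sampled fraction mentioned at the end of this paragraph can be checked from figures reported elsewhere in the article. This is a simple illustration that divides each IBBS sample size by the corresponding RDS-adjusted population estimate; the dictionary layout is an assumption made for the example.

```python
# (IBBS sample size, RDS-adjusted population estimate) for each group
samples = {"MSM": (624, 12480), "FSW": (663, 5765)}

# Fraction of the estimated population included in the RDS sample
fractions = {group: n / total for group, (n, total) in samples.items()}

for group, fraction in fractions.items():
    print(f"{group}: {fraction:.1%}")
# MSM: 5.0%
# FSW: 11.5%
```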

This study has several limitations inherent to RDS, including those mentioned above. In addition, as in any study of highly stigmatised populations, it is possible that members of the MSM and FSW populations chose not to participate for fear of being identified. There have been no other sources of size estimation, such as population-based studies, in El Salvador to validate these results. Other methods, such as including specific questions in national demographic health surveys, would be useful to triangulate results and improve the estimations. In behavioural surveillance there is always a high potential for socially desirable responses. Moreover, most methods of size estimation are subject to biases that lead to under- or overestimation, and researchers rarely know in which direction. Data from a survey that is a ‘gold standard’ in terms of sampling methodology and representativeness, and whose estimates are widely agreed to be conservative, therefore provide a useful lower bound on which all can agree, as a first step forward.

It has been known for many years that the MSM and FSW populations are at greater risk of HIV in El Salvador. Our estimates suggest that these populations in San Salvador are sizeable and should be targeted for interventions. It is essential that countries with concentrated epidemics have up-to-date estimates of the size of their most at-risk populations. Such data will allow the El Salvador Government, national and international agencies and other partners to prepare and budget interventions for these populations.

Key messages

  • This study reports on a feasible method to estimate population sizes by linking capture–recapture to an HIV prevalence and behavioural study using RDS.

  • The difference between the crude and adjusted estimates highlights the importance of having a representative study as the second data source for size estimation using capture–recapture.

  • As behavioural surveys in hidden populations become more common and use more robust designs to capture representative data, they should incorporate an estimation of population size in addition to other study objectives.


The authors would like to thank Keith Sabin from the WHO and Willi Macfarland from University of California, San Francisco, for their useful comments on the manuscript. The authors express their gratitude to the government of El Salvador and the National HIV/AIDS and STI Programme for leading this study. Appreciation is due to the Panamerican Social Marketing Organization (PASMO) for its collaboration in managing the ECVC study and facilitating the outreach work of this effort, especially Susan Padilla, Gerardo Lara, Manuel Beltrán, Moisés Marinero, Ricardo Hernandez, Edwin Hernandez, Maripaz Callejas, Aracely Corado, Esmeralda Flores, Carlos Escobar, Norma Miranda, Nancy Gonzales and Saúl Hernández. The authors would especially like to thank the field staff and the study participants who made this survey possible.



  • Linked article 049171.

  • Portions of this study were presented at ISSTDR, London, 28 June–1 July 2009 and the International AIDS Conference, Vienna, 18–23 July 2010.

  • Funding Funding for this study was provided by the United States Agency for International Development (USAID) in El Salvador and the Centers for Disease Control and Prevention (CDC).

  • Competing interests None.

  • Patient consent Obtained.

  • Ethics approval This study was conducted with the approval of the ethical review board of the Rosales National Hospital (RNH) in San Salvador. The study was also reviewed by the Centers for Disease Control and Prevention–Global AIDS Program Associate Director for Science, who delegated approval to RNH.

  • Provenance and peer review Not commissioned; externally peer reviewed.
