S13 Respondent-driven sampling: where we are and where should we be going?
S13.3 An empirical evaluation of respondent-driven sampling
  1. N McCreesh (1, 2)
  2. S Frost (3)
  3. J Seeley (1, 2, 4, 5)
  4. J Katongole (4)
  5. M Ndagire Tarsh (4)
  6. R Ndungutse (4)
  7. F Jichi (1, 2)
  8. D Maher (4, 1, 2)
  9. P Sonnenberg (6)
  10. A Copas (6)
  11. R J Hayes (1, 2)
  12. R G White (1, 2)
  1. Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, UK
  2. Faculty of Epidemiology & Population Health, London School of Hygiene and Tropical Medicine, UK
  3. Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
  4. MRC/UVRI Uganda Research Unit on AIDS, Entebbe, Uganda
  5. School of International Development, University of East Anglia, Norwich, UK
  6. Department of Infection and Population Health, UCL, UK


Objective Respondent-driven sampling (RDS) is an increasingly widely used variant of snowball sampling that proponents claim can provide unbiased estimates. However, RDS has not been rigorously evaluated in the field. This study evaluated RDS by comparing estimates from an RDS survey with total-population data.

Methods Total-population data on age, tribe, religion, socioeconomic status, sexual activity and HIV status were available on a population of 2402 male household-heads from an open cohort in rural Uganda. An RDS survey was carried out in this population, employing current RDS methods of sampling (RDS-sample) and statistical inference (RDS-estimates). Analyses were repeated for the full RDS sample and a small sample of the first 250 recruits (including 10 seeds).

Results 927 household-heads were recruited (including 10 seeds). Full and small RDS-samples were largely representative of the total-population for most variables, but under-represented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. RDS statistical inference methods failed to reduce these biases. Only 31–37% (depending on method and sample size) of RDS-estimates were closer to the true population proportions than the RDS-sample proportions. Only 50–74% of RDS bootstrap 95% CIs included the population proportion.
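
The two summary measures reported above (the share of RDS-estimates closer to the population proportion than the raw RDS-sample proportion, and the share of 95% CIs that include the population proportion) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the numbers are invented purely to show the arithmetic, and the treatment of ties (counted as "not closer") is an assumption.

```python
from typing import Sequence, Tuple


def share_closer_to_truth(
    population: Sequence[float],
    sample: Sequence[float],
    estimate: Sequence[float],
) -> float:
    """Fraction of variables for which the adjusted estimate is strictly
    closer to the population proportion than the raw sample proportion.
    Ties count as 'not closer' (an assumption of this sketch)."""
    closer = sum(
        abs(e - p) < abs(s - p)
        for p, s, e in zip(population, sample, estimate)
    )
    return closer / len(population)


def ci_coverage(
    population: Sequence[float],
    intervals: Sequence[Tuple[float, float]],
) -> float:
    """Fraction of confidence intervals (lower, upper) that contain the
    corresponding population proportion."""
    covered = sum(lo <= p <= hi for p, (lo, hi) in zip(population, intervals))
    return covered / len(population)


# Invented illustrative values for four variables (NOT data from the study).
population = [0.30, 0.50, 0.20, 0.10]   # total-population proportions
sample = [0.25, 0.52, 0.24, 0.10]       # unadjusted RDS-sample proportions
estimate = [0.28, 0.55, 0.26, 0.12]     # adjusted RDS-estimates
intervals = [(0.22, 0.34), (0.51, 0.60), (0.18, 0.33), (0.11, 0.15)]

closer_share = share_closer_to_truth(population, sample, estimate)
coverage = ci_coverage(population, intervals)
print(f"closer to truth: {closer_share:.0%}, CI coverage: {coverage:.0%}")
```

If inference were correcting bias, the first share would be expected to sit well above 50%, and well-calibrated 95% CIs would cover the population proportion in roughly 95% of cases; the reported 31–37% and 50–74% fall short on both counts.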

Conclusions RDS produced a generally representative sample of this well-connected non-hidden population. However, current RDS inference methods failed to reduce bias when it occurred. Whether RDS can collect the data required to reliably remove bias and measure precision during analysis is unresolved. As such, although RDS may be a feasible and cost-effective method for sampling hidden or hard-to-reach populations, RDS should still be regarded as a (potentially superior) form of convenience sample, and caution is required when interpreting findings from RDS studies.
