Respondent-driven sampling (RDS) is a network-based technique for estimating traits in hard-to-reach populations, for example, the prevalence of HIV among drug injectors. In recent years RDS has been used in more than 120 studies in more than 20 countries, and by leading public health organisations, including the Centers for Disease Control and Prevention in the USA. Despite the widespread use and growing popularity of RDS, there has been little empirical validation of the methodology. In this talk, I investigate the performance of RDS by simulating sampling from 85 known network populations. Across a variety of traits, we find that RDS is substantially less accurate than generally acknowledged and that reported RDS confidence intervals are misleadingly narrow. Moreover, it is unlikely that RDS performs any better in practice than in our simulations, because we model a best-case scenario in which the theoretical RDS sampling assumptions hold exactly. Notably, the poor performance of RDS is driven not by bias but by the high variance of estimates, a possibility that has been largely overlooked in the RDS literature. Given the consistency of our results across networks and our generous sampling conditions, we conclude that RDS as currently practiced may not be suitable for key aspects of public health surveillance where it is now extensively applied. This work is joint with Matthew Salganik.
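The best-case scenario described above can be sketched in a few lines of code: sampling proceeds as a single random walk with replacement over a known network (the idealised RDS assumption), and prevalence is estimated with the standard inverse-degree-weighted (Volz-Heckathorn) estimator. The network generator and the trait below are hypothetical stand-ins, not the 85 populations studied in the talk:

```python
import random
from collections import defaultdict

def make_network(n=200, p=0.05, seed=1):
    """Toy Erdos-Renyi-style random graph as an adjacency list,
    a stand-in for a known network population."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    # Drop isolated nodes: a random walk can neither reach nor leave them.
    return {v: nbrs for v, nbrs in adj.items() if nbrs}

def rds_sample(adj, walk_len, seed=2):
    """Best-case RDS: one random walk with replacement, started from a
    node chosen proportional to degree (the walk's stationary start)."""
    rng = random.Random(seed)
    nodes = list(adj)
    v = rng.choices(nodes, weights=[len(adj[u]) for u in nodes], k=1)[0]
    sample = []
    for _ in range(walk_len):
        sample.append(v)
        v = rng.choice(sorted(adj[v]))  # step to a uniform random neighbour
    return sample

def vh_estimate(sample, adj, has_trait):
    """Volz-Heckathorn estimator: inverse-degree-weighted prevalence."""
    weights = [1.0 / len(adj[v]) for v in sample]
    num = sum(w for w, v in zip(weights, sample) if has_trait(v))
    return num / sum(weights)

if __name__ == "__main__":
    net = make_network()
    trait = lambda v: v % 2 == 0  # hypothetical binary trait, true prevalence ~0.5
    # Repeating the walk with different seeds shows the spread of estimates,
    # i.e. the variance component highlighted in the talk.
    estimates = [vh_estimate(rds_sample(net, 500, seed=s), net, trait)
                 for s in range(20)]
    print(min(estimates), max(estimates))
```

Running the last block with many seeds illustrates the abstract's central point: even when the walk assumptions hold exactly, individual estimates scatter around the truth, so the variability of the estimator, not its bias, dominates the error.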