Surveillance and modelling of HIV, STI, and risk behaviours in concentrated HIV epidemics
S Mills1, T Saidel2, R Magnani3, T Brown4

1 Family Health International, Hanoi, Vietnam, and the Department of Epidemiology and Population Health, Centre for Population Studies, London School of Hygiene and Tropical Medicine, London, UK
2 Family Health International, Institute for HIV/AIDS, Arlington, VA, USA
3 Family Health International, Institute for HIV/AIDS, Arlington, VA, USA
4 East-West Center, Bangkok, Thailand

Correspondence to: Mr S Mills


Background: HIV epidemics in most countries are highly concentrated among population subgroups such as female sex workers, injecting drug users, men who have sex with men, mobile populations, and their sexual partners. The perception that these groups are important only when they cause epidemic expansion to general populations has obscured a critical lack of coverage of preventive interventions in these groups, as well as a lack of appropriate methods for monitoring epidemic and behavioural risk trends. The difficulties in accessing such groups have likewise often cast doubt on the representativeness, validity, and reliability of observed disease and behavioural risk estimates, particularly with respect to sampling and the measurement of risk behaviours.

Objectives: To review methodological obstacles in conducting surveillance with population subgroups in concentrated HIV epidemics, elaborate on recent advancements that partially overcome these obstacles, and illustrate the importance of modelling integrated HIV, STI, and behavioural surveillance data.

Methods: Review of published HIV, STI, and behavioural surveillance data, research on epidemic dynamics, and case studies from selected countries.

Conclusions: The population subgroups that merit regular and systematic surveillance in concentrated epidemics are best determined through extensive assessment and careful definition. Adherence to recently refined chain referral and time location sampling methods can help to ensure more representative samples. Finally, because of the inherent limitations of cross sectional surveys in understanding associations between complex sexual behaviours and HIV and STI transmission, mathematical models using multiple year data offer opportunities for integrated analysis of behavioural change and HIV/STI trends.

  • ANC, antenatal clinic
  • FSW, female sex worker
  • IDU, injecting drug user
  • MSM, men who have sex with men
  • RDS, respondent driven sampling
  • HIV
  • AIDS
  • risk behaviours
  • surveillance

After more than 20 years of the HIV pandemic, the levels of current infection exhibit extraordinary variation between countries and regions. Whereas countries in sub-Saharan Africa have an estimated overall HIV prevalence among adults of 8.8% (the four countries of Botswana, Lesotho, Swaziland, and Zimbabwe all report over 30%), more than 70% of the world’s countries report that less than 1% of their adult population is infected with HIV.1

HIV transmission in many of the high prevalence countries of Africa is related to high levels of sexual partner turnover among both males and females leading to a pattern of “generalised” heterosexual spread of HIV. In most other parts of the world, HIV is concentrated among subpopulations with high risk behaviours (people in networks with high sexual partner turnover or substantial needle sharing) and their long term sexual partners.2–4 Examples of such “concentrated” or emerging epidemics exist throughout Asia, Latin America, Europe, and North America.5–8 When it is concentrated among these populations, HIV transmission is thwarted most efficiently by intensive and focused interventions, and countries that have mounted such programmes have achieved clear evidence of reduced prevalence and incidence. For instance, in Thailand and Cambodia interventions targeted at female sex workers (FSWs) and their clients have resulted in dramatic increases in condom use, decreases in commercial sex, and subsequent reductions in HIV and STI transmission.9,10

Unfortunately, examples where interventions have not been focused or early enough are more plentiful. Several countries in Eastern Europe and Central Asia, for example, have rapidly expanding epidemics in their injecting drug using populations.5,11 Parts of China and India, the two most populous countries in the world, also report expanding epidemics—primarily fuelled by injection drug use, commercial sex partnerships, and highly mobile populations.8,12,13

The cornerstone of a country’s HIV response should be its surveillance of HIV, STIs, and behavioural risk factors.14–16 Unfortunately, the surveillance systems of most countries, including both formalised government surveillance and ad hoc surveys, are still insufficient to monitor epidemic and risk trends adequately. For example, the high levels of HIV found among injecting drug users (IDUs) in Eastern Europe are clear evidence of an epidemic that started years earlier, but which was not adequately monitored in time for an appropriate prevention response.11

Providing the most appropriate biological and behavioural surveillance in countries where HIV is low or concentrated raises challenging methodological issues because of difficulties in defining and gaining access to the major population subgroups affected. These problems often cast doubt on the representativeness of observed estimates and the validity and reliability of self-reported behaviour. In this paper, we highlight several of these methodological obstacles, elaborate on recent advancements that partially overcome them, and finally illustrate the important use of modelling integrated surveillance data.


Recognising that a standardised approach to surveillance neither does justice to the diversity of epidemic patterns nor informs appropriate responses, WHO and UNAIDS introduced the concept of second generation surveillance in the late 1990s.17,18 Its key innovation was to stress varying surveillance designs for countries with different epidemic settings. Proxy measurements from antenatal clinic (ANC) attendees, together with the Demographic and Health Survey and other ad hoc nationally representative surveys, many of which now include biological measurements of HIV and STI, provide important estimates at the general population and national levels.19–21

However, in settings where risk behaviours are concentrated in a small proportion of the total population, these surveys do not detect risk efficiently nor monitor behavioural risk changes appropriately over time. In such settings, surveillance needs to focus on those population subgroups with the highest levels of risk so that changes in HIV, STI, and behaviours can be effectively monitored over time.3,16

The population subgroups that tend to fuel concentrated and low grade epidemics are FSWs, IDUs, men who have sex with men (MSM) (including those who sell sex), and mobile populations. As these groups are often hidden, stigmatised, or, perhaps, not even recognised to exist by their governments, accessing them for surveillance purposes, constructing representative sampling frames, and obtaining reliable estimates of HIV, STI, and behavioural risks pose considerable challenges. These challenges are discussed in the following sections.


The main function of surveillance is to help provide an understanding of local epidemics, including sources of new infections over time, and the behavioural and biological factors driving epidemic spread. Thus, the focus of surveillance should be on population subgroups that are useful in monitoring and tracking the current and future course of the epidemic. Even where the prevalence of HIV is high, population subgroups may be identified that disproportionately contribute to epidemic spread. Research in both Africa and Asia has suggested that even in settings where HIV prevalence in the population at large has reached 2% or 3% (a so called “generalised epidemic”), most new transmission is still attributed to those with higher risk behaviours and their immediate sexual partners. A failure to focus prevention resources on high risk subpopulations results in far greater numbers of infections in the general population.22,23

The interlinked nature of these high risk subpopulations with the general population ensures that epidemics among population subgroups such as IDUs can quickly contribute to growth of heterosexual spread through sex work. Research has shown that, although the greatest number of infections can be averted when interventions are focused on high risk subpopulations such as IDUs before extensive spread of infections into the general population, IDUs continue to be the source of a disproportionately high number of infections even in advanced stages of these epidemics.23 Research and modelling of data from Cotonou, Benin (a setting with approximately 2.5% ANC prevalence at the time of the study in 1999), has shown that most of the HIV infection among the general population of women there is attributable to transmission between sex workers and their clients.22

Careful consideration of the definitions of population groups is important because of diversity within each group. For example, sex workers are not a homogeneous group but comprise women who work out of a variety of institutional and non-institutional settings. Because of their importance to epidemic spread, subgroups of sex workers may merit inclusion in a country’s surveillance. Thailand, for example, tracks both HIV and risk behaviours in two types of sex workers in the country (brothel based and non-brothel based) because these groups differ sharply in terms of risk (frequency of sexual contact, levels of other STIs, levels of protective behaviour), vulnerability, and appropriate intervention approaches.24 The variability in these factors is seen in the large HIV prevalence differences between the two groups: among brothel based sex workers prevalence peaked at roughly 30%, whereas among other sex workers the peak was only 10%.

These definitional complexities merit presurveillance assessments in order to inform decisions on which subgroups to include in surveillance and how to sample them. Such assessments may take the form of qualitative research with key informants and members of the population subgroup in order to gain enough preliminary information to determine whether to initiate more expensive surveillance surveys. They may also include subanalyses of national surveys to indicate which population subgroups are at increased risk and potentially fuelling the epidemic. Where epidemics are driven by commercial sex work, IDUs, MSM, or mobile populations, presurveillance assessments provide information on where potential hubs of HIV prevalence and risk behaviours are and how population subgroups are organised, so that operational definitions and eligibility criteria can be determined.

Surveys typically attempt to build samples that are representative of the study population. However, with many population subgroups of import to HIV epidemic spread, the study population can be defined by not one, but often several varied parameters that have different epidemic implications. In the case of MSM, for example, the complex issues of behaviour and identity have been well documented, convincingly demonstrating that population characteristics may be different depending on definition.25–27

Self-report biases notwithstanding, a survey representative of the adult male population can provide an estimate on the percentage of men having engaged in same sex behaviours. However, in terms of HIV and STI epidemic dynamics, these men may be far less relevant than the subset of men who regularly meet each other in environments such as saunas or parks where there is often high sexual partnership turnover and the simultaneous selling of sex. Recent surveys have shown rising HIV prevalence among different subsets of MSM.6,7

For IDUs, those who obtain drugs and/or inject in public places may be at greater risk of sharing needles, and therefore of HIV infection, than those who inject at home. However, this distinction varies by setting and can only be verified from research obtained through careful qualitative and quantitative assessments.28

Clients of sex workers pose similar definitional challenges for surveillance. Ultimately, representative measures of what proportion of men in a population buy sex (or engage in multiple partner sex) can only be obtained through general population surveys. However, in settings where this proportion is relatively small (less than 10%), a more cost effective method of measuring changes in risk levels is to obtain them from subgroups of the male population who visit sex workers disproportionately often.

Many countries, among them Thailand and Cambodia, have included male population subgroups, such as the military and police, in their surveillance in order to monitor trends in men who engage in commercial sex at high rates of frequency.10,29 Clearly, these groups do not represent the general population, but their increased risk above and beyond that in the general population merits systematic and regular tracking of HIV, STI, and risk behaviours. Although surveys specifically of sex worker clients are feasible and have been conducted in numerous settings, they miss an important opportunity to measure change in the prevalence of commercial sex contacts. Including only those men who report paying for sex means that monitoring reductions (or increases) in commercial sex in the sample is impossible. Using again the examples of Thailand and Cambodia, the surveillance of several male population subgroups over time indicated a significant decrease in the proportion of men who reported commercial sex, and this was a major factor in epidemic decline in these countries.


Until recently, surveys of MSM, IDUs, and FSWs used convenience sampling methods, without understanding how these subpopulations are structured and whether subgroups may exist within them that are important for surveillance coverage or a prevention focus. However, replicable sampling strategies are crucial for measuring biological and behavioural trends.30–32 They must be capable of accurately defining sampling frames and eligibility criteria and yield samples that are selected on probability principles. If they are not, observed changes over time may be methodological artefacts that are the result of differences in sampling procedures in successive survey rounds rather than real changes in the underlying population.30

Sampling procedures for surveys of the general population (that is, household surveys using multistage cluster sampling) are well established and accepted.32 By contrast, researchers have struggled to find methods to build representative sampling frames of the population subgroups mentioned above. No formal lists exist of locations where commercial sex or drug use occur, and so mappings based on assessments or ethnographies need to be constructed prior to surveys in order to devise sampling frames of where groups reside, meet, or work. However, because of the often clandestine or illegal nature of these groups’ behaviours, these mappings are often incomplete or superficial. Even if accurate at one point in time, they soon become outdated in their characterisation of locations, as group members are often pursued by police and therefore highly mobile.

Early survey and surveillance efforts relied largely on non-probability sampling methods such as convenience and snowball sampling.33–35 This evolved to a greater reliance on two forms of probability sampling (cluster sampling and time location sampling) in the mid to late 1990s.36,37 Despite this advance, certain key subpopulations, notably those whose members tend not to congregate in identifiable and accessible locations, were still not adequately represented by these sampling methods. Recent methodological advances in one type of chain referral sampling, namely respondent driven sampling (RDS), appear promising as an alternative sampling approach for such “hidden” subpopulations.38–40

Respondent driven sampling is a member of a new class of sampling methods, termed “link-tracing/adaptive sampling designs”, that are designed to operate in settings where traditional probability sampling methods are not feasible.41 Like all chain referral approaches, RDS is premised on the assumption that members of subpopulations themselves can most efficiently identify and encourage participation by other subpopulation members. By introducing recruitment incentives and quotas to encourage long referral chains, it can be shown that the composition of samples with regard to key characteristics and behaviours using RDS approximates that of the larger subpopulation and that the ultimate sample composition is independent of the characteristics of the initial “seeds”.42 RDS also uses post-stratification weights to compensate for “homophily” (that is, the propensity for study subjects to recruit others like themselves) and differences in personal network size. Furthermore, although it is a form of chain referral sampling, it can be shown that RDS is, in fact, a probability sampling method, with procedures developed to calculate sampling errors associated with survey estimates based upon RDS.42
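The weighting idea can be illustrated with a minimal sketch. It is not the estimator developed in reference 42: it assumes a simplified inverse network size weighting, in which each respondent is weighted by the reciprocal of his or her reported personal network size, and it makes no adjustment for homophily. The function name and the sample values are hypothetical.

```python
from typing import Sequence


def inverse_degree_estimate(outcomes: Sequence[int], network_sizes: Sequence[int]) -> float:
    """Estimate the proportion with a trait, weighting each respondent by 1/network size.

    Assumption (ours, for illustration): respondents with larger personal
    networks are more likely to be recruited, so each respondent is
    down-weighted in proportion to reported network size.
    """
    if len(outcomes) != len(network_sizes):
        raise ValueError("outcomes and network_sizes must have the same length")
    if not outcomes:
        raise ValueError("at least one respondent is required")
    if any(size <= 0 for size in network_sizes):
        raise ValueError("network sizes must be positive")
    weights = [1.0 / size for size in network_sizes]
    weighted_positive = sum(w * y for w, y in zip(weights, outcomes))
    return weighted_positive / sum(weights)


# Hypothetical sample: 1 = reports the behaviour, 0 = does not.
outcomes = [1, 1, 0, 0]
network_sizes = [10, 10, 2, 2]

naive = sum(outcomes) / len(outcomes)          # unweighted sample proportion
weighted = inverse_degree_estimate(outcomes, network_sizes)
```

In this made-up sample the two respondents who report the behaviour also report large networks, so the weighted estimate falls below the unweighted sample proportion.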

Respondent driven sampling is currently undergoing further testing both in the US and in developing country settings. Applications of RDS in the US suggest that RDS captures members of “hidden” populations that other commonly used sampling methods miss. On the basis of the available information, RDS would appear to be the preferred choice for monitoring behavioural trends among respondent groups that do not often congregate (for example, IDUs) and, depending on the results of the ongoing pilot tests, also for other subpopulations for which time location sampling is currently being used.

Although RDS appears promising in providing a means for reaching “hidden” subpopulations, it is not necessarily the best choice for all groups. For example, brothel based FSWs tend to be associated with brothels in a relatively stable manner and thus may be efficiently sampled from a list based on clusters or using time location sampling. On the other hand, FSWs who work on street corners cannot be effectively sampled from such a list, and therefore either time location sampling or RDS would be the preferred choice.

The presurveillance assessments mentioned above are necessary in order to delineate different subgroups and to provide information on how they can best be accessed and sampled. Once a subgroup has been identified as meriting surveillance coverage in a given setting, the choice of sampling strategy is guided by four considerations:

Figure 1 indicates the recommended sampling approach based upon the answers to these questions.

Figure 1 Criteria for choosing a sampling method.


Efforts to propose more standardised methods to monitor HIV, STI, and risk behaviours as well as standardised behavioural indicators across countries have been fractious.3,43–45 The debate reflects a perhaps quixotic search to obtain a small set of predictive indicators of what is increasingly shown as a complex and often locality specific relation between sexual behaviours and HIV and STI transmission.

Cross sectional surveys have generally not provided the associations between biological and behavioural variables that many had hoped to see. Condom use has gone through rigorous testing of its measurement and its relationship to HIV and STI outcomes, most of which has shown disappointingly low associations.46–49 The four city study of the dynamics of the epidemic in Africa also showed that very few of the measured behaviours predicted HIV or STI prevalence in the study sites.50,51 Similar findings have been common in many of the integrated surveys of sexual behaviours and HIV and STI prevalence. This lack of correspondence between behaviour and infection should come as no surprise. The historical and temporal influences on prevalent HIV and STI are not easily captured in one cross sectional survey but require years of monitoring from either multiple cross sectional surveys or longitudinal designs. The probability that risk behaviours lead to HIV and STI transmission depends on a combination of factors, including the presence of overlapping or concurrent sexual partnerships, HIV and STI prevalence levels in those networks, rate of partner change, and type of partner.52–54 This multitude of factors cannot be investigated in a single study but requires multiple methods and triangulation over time.

Recent research among high risk population groups has shown the importance of specifying types of sexual partnerships and condom use within and across partnerships. Men are far more likely, for example, to initiate condom use with FSWs than with other non-regular sex partners or with regular partners such as wives.55–57 Furthermore, epidemics can easily cross over from one high risk group to another, and this needs to be captured in behavioural surveys. A survey of IDUs in Indonesia, where in some sites almost half are HIV infected, showed that over two thirds were sexually active, over half reported multiple sexual partners, and 40% had bought sex from an FSW in the past year.58

The time frame reference of risk behaviours in questionnaires has also recently come under more scrutiny. Some researchers have argued that last event measurements based on “critical incident” methodology are more reliable than asking respondents to report on aggregated or cumulative behaviours over a given time period.43,44 Very recent events (such as whether a condom was used at last sex) tend to have the best recall by respondents but they are not representative of all general exposure over a longer period of time. In addition, the measurement of whether a needle was shared at any time “in the last day or week” is very different from “over the last six months”. If the behaviour is very common, measurements over a six month period will not be as sensitive to change as those based on a shorter recall period. Conversely, if the behaviour is rarer, measurements covering a more recent time frame may not capture exposure sufficiently.
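The arithmetic behind this trade-off can be sketched under a strong simplifying assumption that is ours and not drawn from the cited studies: each day a respondent engages in the behaviour independently with a fixed probability p, so that the proportion reporting any occurrence in a window of n days is 1 − (1 − p)^n. The daily probabilities and window lengths below are hypothetical.

```python
def prob_any_event(daily_prob: float, days: int) -> float:
    """Probability of at least one event in `days` days, assuming independent days."""
    if not 0.0 <= daily_prob <= 1.0:
        raise ValueError("daily_prob must lie between 0 and 1")
    if days < 0:
        raise ValueError("days must be non-negative")
    return 1.0 - (1.0 - daily_prob) ** days


WEEK, SIX_MONTHS = 7, 180  # recall periods in days (illustrative)

# Common behaviour: the daily probability halves from 0.10 to 0.05.
common_week = (prob_any_event(0.10, WEEK), prob_any_event(0.05, WEEK))
common_six_months = (prob_any_event(0.10, SIX_MONTHS), prob_any_event(0.05, SIX_MONTHS))

# Rare behaviour: a daily probability of 0.001.
rare_week = prob_any_event(0.001, WEEK)
rare_six_months = prob_any_event(0.001, SIX_MONTHS)
```

With these illustrative inputs, halving the daily probability of a common behaviour lowers the one week measure from roughly 0.52 to roughly 0.30, whereas the six month measure stays close to 1 in both cases. For the rare behaviour, fewer than 1 in 100 respondents would report it in the past week, which leaves little room to observe change.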

Multiple measurements of the same behaviour will help to compensate for the weaknesses of a singular measure. For example, in measuring condom use among FSWs and their clients, questionnaires that contain both most recent use of condoms with the last paying partner, as well as some type of measurement of consistent use over a period of time, are capturing complementary data. Both measures are imperfect, but the strengths of one offset the weaknesses of the other.


Improvements in HIV and STI diagnostics now make testing more convenient to implement and samples easier to store and transport to distant laboratories.59,60 Saliva based HIV tests offer respondents who would prefer not to provide a blood sample an opportunity to participate in surveys and to receive results. These developments have led to an increase in integrated surveys of HIV, STI, and risk behaviours at the population level.61

The analysis of the triad of HIV, STI, and risk behaviours in key population groups is critical for understanding biological and behavioural dynamics. HIV epidemics have complex temporal dynamics, and prevalence is only a measure of cumulative risk and exposure to HIV, not of current risk. In many populations, behaviours are changing rapidly over time, both in response to the epidemic and to prevention efforts. STIs are changing dynamically as treatment accessibility improves and condom use rises. Unless longitudinal designs are used, the influence of each of these factors on current HIV prevalence will be difficult to ascertain. This situation is further complicated by the fact that every epidemic consists of multiple subepidemics, both at a subpopulation and geographic level, with significant variation within and between them.62

Models must be sophisticated enough to address these dynamics but at the same time be simple enough for input data needs to be met. This is particularly the case with the increasing scrutiny and discussion regarding the effects of interventions and the contribution of different types of behaviour change to epidemic decline.63 In Asia, a variety of models have been applied that incorporate HIV, STI, and risk behaviour parameters. In Cambodia, for example, a simple model of the epidemic was built by fitting ANC prevalence over time with a curve fitting model, constructing male infection levels from data on male:female ratios, and then estimating incidence from the time trends and progression from HIV infection to death. This model gave incidence trends that were consistent with what would be expected given the increasing condom use and declining visits to sex workers reported in behavioural surveillance.10
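The last step of such an approach, estimating incidence from a prevalence trend and progression to death, can be sketched as a simple back-calculation. This is an illustration of the general idea only and not the Cambodian model itself: it assumes yearly time steps, a known survival schedule after infection, and no other exits from the population. All function names and numbers are hypothetical.

```python
from typing import List, Sequence


def forward_prevalent(incidence: Sequence[float], survival: Sequence[float]) -> List[float]:
    """People alive with HIV each year, given yearly new infections.

    survival[a] is the assumed probability of being alive a years after infection.
    """
    return [
        sum(incidence[s] * survival[t - s] for s in range(t + 1))
        for t in range(len(incidence))
    ]


def back_calculate_incidence(prevalent: Sequence[float], survival: Sequence[float]) -> List[float]:
    """Recover yearly new infections from a series of people alive with HIV."""
    if len(survival) < len(prevalent):
        raise ValueError("survival schedule must cover every year of the series")
    if survival[0] != 1.0:
        raise ValueError("survival[0] must be 1.0 (everyone alive in the year of infection)")
    incidence: List[float] = []
    for t, alive_now in enumerate(prevalent):
        # Survivors in year t among people infected in earlier years.
        earlier_survivors = sum(incidence[s] * survival[t - s] for s in range(t))
        # Whatever is left must be new infections; clamp at zero for noisy input.
        incidence.append(max(alive_now - earlier_survivors, 0.0))
    return incidence


# Hypothetical inputs, chosen only to show that the round trip is consistent.
survival = [1.0, 0.9, 0.8, 0.7, 0.6]
true_incidence = [100.0, 200.0, 300.0, 200.0, 100.0]
prevalent = forward_prevalent(true_incidence, survival)
recovered = back_calculate_incidence(prevalent, survival)
```

In this made-up series the number of people living with HIV is still rising in the final year even though new infections peaked two years earlier, which is why prevalence trends alone can mask a decline in incidence.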

The use of more sophisticated mathematical modelling of HIV, STI, and risk behavioural data over time offers opportunities for projections, for examining associations between behavioural change and HIV/STI increase or decline, and for cross validation of biological and behavioural measurements.64 Both Thailand and Cambodia have used data from combined biological and behavioural surveillance in a recently developed model, known as the Asian Epidemic Model. This model illustrated shifts in infection patterns over time, and the contribution of increases in condom use and reductions in commercial sex partnerships to overall epidemic decline.10,29

In Thailand, modelling efforts suggested that behavioural change has averted approximately 5.7 million HIV infections since 1990. The Asian Epidemic Model also demonstrated that this has been primarily due to increases in condom use and, secondarily, to observed reductions in the percentage of men who visit sex workers. Furthermore, as one infection route is reduced through prevention, other routes begin to constitute a greater portion of overall transmission. In the year 2000, one fifth of new HIV infections in Thailand were estimated to stem from needle sharing among drug users, highlighting the need for more intensive harm reduction interventions in this group.29

In concentrated and low level epidemics, modelling can be critical in determining how subepidemics within high risk groups impact upon the general population. Applying the model to a typical Asian setting, researchers have explored different scenarios of potential HIV transmission by IDUs to non-injecting sexual partners and beyond.23 They found that in the early stages of a heterosexual epidemic, maintaining HIV prevalence at low levels among IDUs averts a substantial number of infections in the population of non-injectors by delaying the start of a commercial sex epidemic. In more advanced heterosexual epidemics, preventing HIV among IDUs still averts sizeable numbers of infections.


By their very nature, surveillance surveys are reductive and cannot capture all of the behavioural and biological dynamics occurring in populations. Clearly, other forms of ad hoc quantitative and qualitative research are needed by a country not only for decision making but also to update and refine surveillance systems as the epidemic evolves. In our view, the distinguishing characteristic of “bare minimum” surveillance is that it should, as a whole, directly and regularly inform national estimates, projections, and policy making. Continuing questions regarding potentially newly emerging routes of the epidemic and/or validity issues are natural offshoots of a country’s surveillance and should be answered by additional research that leads to changes and updates in the overall surveillance design.

Beyond the methodological issues we have cited and addressed above, high risk population groups such as FSWs and their clients, MSM, IDUs, and migrants who cross international borders continue to experience intense stigma and discrimination in most countries, and there is often little political support for addressing their role in HIV epidemics. Indeed, their existence is often denied by governments or, if it is recognised, they are persecuted—a situation highly unfavourable to reducing HIV epidemics in these populations.

A prevention focus on these groups is often prioritised only when it is shown that HIV transmission among them may lead to increased infection in the general population. The underlying message—often translated to inaction—is that such population groups are not valuable to society and only merit prevention intervention if they are conduits of infection to more “worthwhile” members of society. But the reality is that these populations form a significant part of the general population. Taken together, FSWs, their clients, MSM, IDUs, and their immediate sexual partners constitute more than 5% of the adult population in many countries, and a failure to address their prevention needs will place a serious yet avoidable burden on their societies in the future. The removal of stigma and discrimination from historically disenfranchised groups requires dramatic and long term social change, and one component of that change is to ensure adequate estimates on disease prevalence (in this case HIV and STI) and behavioural risks and their trends over time.


Funding for this paper was partially provided by the United States Agency for International Development (USAID) through the IMPACT Project managed by Family Health International (FHI). Views expressed are not necessarily the views of USAID, FHI, or the East-West Center.



  • S Mills and T Saidel conceptualised this paper and wrote most of the manuscript. R Magnani contributed theoretical and methodological aspects of the section on sampling. T Brown provided data and modelling applications from Thailand and Cambodia. All authors contributed to conclusions and implications.
