Table 1

 Summary of important factors to consider when choosing an evaluation design (details in supplementary table S1, available on the STI website:

Strength of evidence*13 | Design | Comments
*According to Habicht et al.13.
Supplementary fig S2 is available on the STI website (
Was the intervention effective? → Strong causality statement
Probability:13–15 Demonstrate, with a high degree of certainty, if the intervention was a causal determinant in the improvement of the primary indicators
Experimental design: Community-based randomised controlled trials14
Parallel design: Communities are randomised and allocated at the start of the trial between intervention and control arms
Empirical estimates of incidence needed
Stepped-wedge design:15,16 Each community receives the control and the intervention sequentially, at randomly allocated time points during the trial
High rates of loss to follow-up among high-risk cohorts, especially with long follow-up
Large cohorts needed to measure differences in incidence in the general population
Intervention less likely to be “real world”
May be unethical as it delays the roll-out of the intervention to the control group
May still be unethical if it (stepped-wedge design) increases the trial duration and slows the scale-up of the intervention
Did the programme seem to have an impact? → Medium to weak causality statement
Plausibility:13,21,22 Demonstrate, with a certain level of uncertainty, whether the programme may have had an effect above and beyond other external influences
Quasi-experimental design: Non-randomised valid control group to assess what might have happened in absence of the intervention
Internal control group: Population at baseline (pretest–posttest type design)21,22
No randomisation
External control group: From areas where the programme has not been implemented
Intervention more likely to be “real world”
Multiple baseline interrupted time series: Pretest–posttest with more than two communities repeatedly assessed over time (ideally >50 time points), before and after the (non-randomised) intervention
More validity threats (eg selection biases, different sample characteristics, etc) than with experimental design
Internal control group: Sub-groups of the population receiving the intervention who have remained completely or partly unexposed
Do not take into account the transmission dynamics of infection
Simulated control group: Use transmission dynamics model to simulate control group under same conditions as in target population, but in absence of the intervention, using data collected at the start of the intervention
Stronger causality statement if results of intervention impact can be compared across many communities
Logistically difficult if multiple time points or communities are used
Additional considerations:
Assess individual-level impact only
Additional considerations:
Estimates of the overall population-level impact of behavioural modifications on HIV rates after the intervention
Estimates of the impact of the intervention, and of other contributing factors (see supplementary fig S2)
Impact assessment takes into account the transmission dynamics of the epidemic
Stronger causality statement
Did the expected change occur? → Weak—no causality statement
Adequacy:13,21,22 Assess if changes in the expected direction in primary indicators have occurred
Observational: No control group per se
Surveillance of health indicators over time among the appropriate target populations
Data necessary although mainly descriptive
Can only demonstrate that the trend is going in the desired direction
Intervention more likely to be “real world”
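
The stepped-wedge entry in the table describes communities crossing from control to intervention at randomly allocated time points. As an illustration only, the minimal sketch below shows one way such an allocation could be generated. The function name `stepped_wedge_schedule`, its parameters, and the choice to spread communities evenly across crossover steps are assumptions made for this example, not the procedure of any trial cited in the table.

```python
import random


def stepped_wedge_schedule(communities, n_steps, seed=0):
    """Hypothetical helper: randomly allocate a crossover step to each community.

    Periods run from 0 to n_steps. Period 0 is an all-control baseline and,
    by the final period, every community has received the intervention.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    order = list(communities)
    rng.shuffle(order)  # random order determines who crosses over first

    # Assumption: spread communities as evenly as possible over steps 1..n_steps.
    crossover = {c: 1 + i % n_steps for i, c in enumerate(order)}

    # A community is under "control" before its crossover step and under
    # "intervention" from that step onwards.
    schedule = {
        c: ["intervention" if t >= step else "control" for t in range(n_steps + 1)]
        for c, step in crossover.items()
    }
    return crossover, schedule


if __name__ == "__main__":
    crossover, schedule = stepped_wedge_schedule(["A", "B", "C", "D", "E", "F"], n_steps=3)
    for community in sorted(schedule):
        print(community, crossover[community], schedule[community])
```

Each community contributes both control and intervention periods, which is why the design can reduce the ethical concern of withholding the intervention, at the cost of a longer trial as noted in the comments above.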