The use of web-based diaries in sexual risk behaviour research: a systematic review
Carolyn Stalgaitis, Sara Nelson Glick

Department of Epidemiology and Biostatistics, The George Washington University, Washington, District of Columbia, USA

Correspondence to Dr Sara Nelson Glick, The George Washington University, 950 New Hampshire Ave NW, Suite 500, Washington, DC 20052, USA; snglick@gwu.edu

## Abstract

Background An increasing number of studies have used the diary method, which provides quantitative event-level data about sexual encounters. Diaries are an attractive tool for sexual behaviour research, yet little is known about the range of uses, methodological issues and best practices associated with this technology.

Objectives To conduct a systematic review of the literature regarding the use of web-based diaries in sexual risk behaviour studies.

Design Systematic review.

Data sources Five bibliographical databases, supplemented by references from previous reviews.

Methods Eligible studies were published in English before August 2013, used the internet to transmit data from collection device to study staff, and measured behaviours affecting HIV or sexually transmitted infection transmission risk. The primary author conducted an initial screen to eliminate irrelevant articles. Both authors conducted full-text reviews to determine final articles. We abstracted data on diary methodology, validity and reactivity (behaviour change caused by diary completion).

Results Twenty-three articles representing 15 studies were identified. Most diaries were collected daily for 1 month via websites, and completion was generally high (>80%). Compensation varied by study and was not associated with completion. Studies comparing diary with retrospective survey data demonstrated evidence of over-reporting on retrospective tools, except for the least frequent behaviours. Most studies that assessed reactivity as a result of diary completion demonstrated some change in behaviour associated with frequent monitoring.

Conclusions Web-based diaries are an effective means of studying sexual risk behaviour. More uniform reporting and further research on the extent of reactivity are needed.

## Introduction

One of the greatest methodological challenges in HIV and sexually transmitted infections (STIs) research is obtaining valid measures of sexual behaviour. Increasingly, sexual behaviour studies use the diary method, which provides quantitative event-level data about sexual encounters. Although potentially burdensome to participants due to the time and effort required to complete diaries, their frequent nature confers advantages over methods such as retrospective (eg, 30-day) surveys, including improved recall.1 Diaries typically collect daily data such as number of partners,2 mood3 or STI symptoms,4 which may be more precise than aggregate measures collected in retrospective surveys.1 ,5 ,6 Event-level data may also allow for more accurate identification of predictors of risky behaviour and HIV/STI transmission.6

These advantages make diaries an attractive tool for sexual behaviour research. Historically, paper diaries were most common, despite several shortcomings. Compliance with the diary schedule is difficult to confirm, and participants may hoard diaries and complete multiple entries at once, increasing the potential for measurement error.7–9 Participants in paper diary studies often must carry physical diaries with them, and may forget to complete them without frequent reminders from study staff.1 ,10 Additionally, paper diaries require staff to physically enter responses into a database, leading to increased staff burden and a high potential for data entry errors given the volume of responses collected in most diary studies.1 ,11

In response to technological advances and the limitations of paper diaries, researchers have increasingly employed electronic diaries. Electronic diaries can be collected by email, website, personal digital assistant (PDA), phone or other means of electronic data recording. This allows diaries to be time-stamped, and may be more convenient for participants.1 ,10 ,12 Electronic diaries can implement skip patterns to reduce participant burden, identify incomplete or out-of-range responses in real time and reduce data entry errors.1 ,10 ,12 The electronic format may also increase privacy.1 ,12 This is an important consideration for HIV/STI diary research given that diaries may collect partners’ initials or names, thereby increasing risk of confidentiality breach.

Electronic diaries, especially formats that permit remote data collection, possess enormous potential for sexual behaviour research. However, little is known about the range of uses, methodological issues such as reactivity (behaviour changes that result from study participation) and best practices associated with this technology. To inform the use of diaries in future studies, we conducted a systematic review of the published literature on web-based diaries used for sexual risk behaviour research. Specifically, we examined the following characteristics of these studies: objectives; study populations; diary collection methods and frequency; variables measured; compensation; approaches to missing data; diary completion rates; participant acceptability and privacy; diary and survey data validity; and evidence of reactivity or behaviour change as a result of diary completion.

## Methods

### Search strategy and selection criteria

A systematic review of the literature was conducted in August 2013 to identify sexual risk behaviour studies that used web-based diaries. We searched five databases (PubMed; EMBASE; CINAHL; MEDLINE; Cochrane Library) to identify relevant articles using the following search terms: (‘diary’ OR ‘diaries’) AND (‘sexual health’ OR ‘sexually transmitted’ OR ‘STI’ OR ‘STD’ OR ‘sexual behaviour’ OR ‘HIV’ OR ‘AIDS’). Studies published in August 2013 or earlier were eligible for inclusion.
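As a minimal sketch of how the Boolean search string above can be assembled reproducibly, the following Python builds it from the two term lists. The function names and the use of double quotes are assumptions made for illustration; database-specific syntax such as field tags or truncation is not represented.

```python
# Term lists copied from the search strategy described above.
DIARY_TERMS = ["diary", "diaries"]
TOPIC_TERMS = [
    "sexual health", "sexually transmitted", "STI", "STD",
    "sexual behaviour", "HIV", "AIDS",
]


def or_group(terms):
    """Quote each term, join the terms with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine one OR-group per term list with AND."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(DIARY_TERMS, TOPIC_TERMS)
print(query)
```

Keeping the terms in lists makes it straightforward to log the exact string submitted to each database.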

Studies meeting the following criteria were included in our review: (1) used web-based diaries; (2) measured sexual risk behaviours; (3) published as a peer-reviewed article by August 2013; (4) English language. We excluded poster or presentation abstracts, case reports, letters to the editor, opinion articles and reviews. Studies could be of any design, but the diary methodology inherently limited results to prospective studies.

We defined web-based diaries as those that used the internet to transmit data from participants to study staff. This included website-based and email-based diaries, along with cell phone or PDA diaries that transmitted data over the internet. Studies that transferred data differently such as through the physical exchange of memory cards were excluded.

To identify articles most relevant to HIV/STI research, we focused on studies that measured HIV/STI incidence and/or examined risk factors for transmission. We included studies that examined condom use, whether or not they explicitly studied incidence, because of the importance of condoms in preventing transmission. Studies of sexual pleasure, contraception other than condoms and sexual dysfunction were excluded.
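The inclusion and exclusion criteria above can be expressed as a simple screening predicate. This is an illustrative sketch only: the `Record` fields, the publication-type labels and the end-of-August cutoff date are hypothetical encodings of the criteria, not a tool used in the review.

```python
from dataclasses import dataclass
from datetime import date

# Publication types excluded by the review (labels are assumed for illustration).
EXCLUDED_TYPES = {"abstract", "case report", "letter", "opinion", "review"}
# "Published in August 2013 or earlier", encoded here as the last day of that month.
CUTOFF = date(2013, 8, 31)


@dataclass
class Record:
    transmits_via_internet: bool   # data reach study staff over the internet
    measures_sexual_risk: bool     # HIV/STI incidence, risk factors or condom use
    peer_reviewed: bool
    language: str
    published: date
    publication_type: str


def is_eligible(record):
    """Return True only if every inclusion criterion is met and no exclusion applies."""
    return (
        record.transmits_via_internet
        and record.measures_sexual_risk
        and record.peer_reviewed
        and record.language == "English"
        and record.published <= CUTOFF
        and record.publication_type not in EXCLUDED_TYPES
    )


# Hypothetical example: a PDA diary study that exchanged memory cards by hand.
pda_study = Record(False, True, True, "English", date(2010, 5, 1), "article")
print(is_eligible(pda_study))
```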

### Study selection and data extraction

From our database search, we generated a list of relevant citations and identified additional citations from references. The primary author scanned articles to exclude citations obviously irrelevant to our review such as studies of erectile dysfunction, sexual pleasure or contraception other than condoms. Both authors reviewed the remaining articles, abstracted data and integrated results to determine the final set of articles. Inconsistencies were resolved by discussion between authors. To the extent possible, we collected the following information from each study: objectives; diary medium, frequency and collection period; study population; sample size; variables measured; participant acceptability and privacy; compensation; and how missing data were handled. Information was also abstracted on diary completion rates, validity of diary data and reactivity. We included validity data because diaries are considered the gold standard for sexual behaviour research, due to the inability of researchers to directly observe sexual behaviour and the lack of confirmatory biological measures.13–15 As a result, diaries are compared with less resource-intensive methods, such as retrospective surveys, to assess the validity of alternative methods. We also examined reactivity, the phenomenon whereby recording a behaviour causes subjects to change that behaviour, because its occurrence may bias observational diary studies.1 ,16 We did not conduct a standardised bias or quality assessment as we were interested in studies’ methods rather than conclusions. Instead, the data we collected—including completion rates, validity and reactivity—were indicators of study quality and potential biases that assisted us in identifying optimal methodological practices.

## Results

Figure 1 presents a flowchart of the screening process. The initial database search yielded 1440 unique records. An additional 245 citations did not include the specific search terms but were identified from references in reviewed articles. The primary author deemed 358 citations potentially relevant based on their titles and reviewed their abstracts to exclude ineligible studies. This yielded 35 articles for full-text review by both authors. Upon closer examination, we excluded 2 articles that used PDAs but did not transmit data using the internet17 ,18 and 10 articles that did not measure sexual risk behaviours as per our eligibility criteria.19–28 This left 23 articles representing 15 studies that used web-based diaries to examine sexual risk behaviour.2 ,3 ,29–49

Figure 1. Search and selection flow chart for a systematic review of web-based diaries in sexual risk behaviour research.
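The screening counts reported above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the combined total of identified citations is derived here rather than quoted from the article.

```python
# Counts reported in the Results text.
database_records = 1440          # unique records from the database search
reference_records = 245          # additional citations found in reference lists
potentially_relevant = 358       # retained after title screening
full_text_reviewed = 35          # retained after abstract screening
full_text_exclusions = {
    "PDA data not transmitted via the internet": 2,
    "did not measure sexual risk behaviours": 10,
}
included_articles = 23

# Derived totals.
total_identified = database_records + reference_records
remaining_after_full_text = full_text_reviewed - sum(full_text_exclusions.values())

print(total_identified)
print(remaining_after_full_text == included_articles)
```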

### Objectives of studies using web-based diaries

Article objectives ranged from identification of HIV/STI risk factors to methodological research. Unprotected sex was a common focus; four articles examined a range of predictors and correlates of condom use,35 ,36 ,39 ,48 while others focused on specific risk factors such as substance use (n=4)30 ,38 ,40 ,43 or mood (n=2).34 ,41 Four assessed rates and predictors of human papillomavirus infection in college students.29 ,31 ,45 ,49 Others examined racial differences in risk behaviours of men who have sex with men (MSM),44 online partnering among MSM42 and condom breakage/slippage.46

Several studies focused on methodological issues, including the feasibility of using diaries with specific populations33 and high-frequency diary schedules.3 One compared web-based diaries with text message and paper diaries,2 while another compared diary schedules.33 Four assessed the validity of retrospective recall surveys.2 ,32 ,33 ,37

### Methodological characteristics

Table 1 presents descriptive characteristics of the reviewed studies (see online supplementary table S4). The most common diary collection method was a website (n=13)2 ,29–31 ,33–49 and the most common submission frequency was daily (n=8).32 ,34–42 ,46 ,47 One study, which used cell phones to collect data, administered diaries thrice daily.3 Seven studies collected diaries for 1 month.32 ,34–42 ,46

Table 1. Design and methods of sexual risk behaviour studies using web-based diaries

The most frequently studied populations were MSM (n=7)30 ,33–35 ,37 ,40–44 ,48 and university students (n=5).29 ,31 ,32 ,38 ,39 ,45 ,47 ,49 Sample sizes ranged from 37 (reference 32) to 3877 (reference 35); seven studies had fewer than 100 subjects.2 ,29 ,32–34 ,37 ,47 ,49

We attempted to enumerate the frequency with which diaries measured specific variables based on information provided in each manuscript. The most common measures were condom use (n=13)2 ,29 ,30 ,32–44 ,46–49 and partner type (n=12).2 ,29–31 ,33–46 ,48 ,49 Studies assessed partner type with varying specificity, including new versus repeat,2 ,29 ,31 ,33 ,37 ,40–45 ,49 casual versus main2 ,30 ,34 ,35 ,38 ,39 ,48 or even further detail (ie, girlfriend, spouse, etc).36 ,46 Diaries were also used to measure the type of sexual activity that occurred (oral, vaginal or anal sex),3 ,29–37 ,40–46 ,48 ,49 partner-specific rather than aggregate behaviours2 ,3 ,29 ,31–34 ,37 ,40–42 ,45 ,49 and substance use.3 ,30 ,34 ,36 ,38–44 ,46 ,48

Compensation for participation varied. Three studies paid participants per diary entry3 ,29 ,33 ,49 and three paid per entry with a bonus for high completion.2 ,34 ,40–42 Average compensation per diary day was $1.18 (range: $0.14–$3.00). The cellphone-based study, in addition to financial compensation, provided participants with free domestic phone calls, text messages and phone-based internet access during the study, with the option to retain the phone at study completion.3

Missing data can introduce bias into a study if appropriate analytical steps are not taken.50 Among studies that reported how they treated missing data, three excluded participants with missing data,3 ,40–44 three used modelling techniques that accounted for missing data29 ,34 ,35 ,49 and one used imputation.33 Missing data are especially relevant for studies comparing diary and recall data; only one of the four studies that did so addressed data missingness, using imputation.33

### Diary completion rates

No single measure of diary completion was reported by all studies (table 2). Regardless of the measures provided, most studies reported relatively high completion. Nine reported that >80% of all diaries were submitted.2 ,3 ,29 ,31–33 ,37 ,40–45 ,49 Four separately reported the proportion of diaries submitted on time; two reported on-time rates >90%.32 ,47 To enable comparisons between studies, we divided the mean or median number of completed diaries by the number expected per subject to obtain a proportion. Four studies reporting means3 ,30 ,40–42 ,47 ,48 and four reporting medians2 ,3 ,30 ,34 ,48 had completion rates >80%. In a 12-month study of young MSM, Glick et al noted a substantial decline in completion after 6 months;33 however, studies of a similar length in heterosexual populations did not report comparable declines.29 ,31 ,45 ,49

Table 2. Diary completion rates among sexual risk behaviour studies using web-based diaries
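The standardised completion proportion described above (mean or median number of completed diaries divided by the number expected per subject) can be sketched as follows. The per-subject counts are hypothetical and serve only to illustrate the calculation.

```python
from statistics import mean, median


def completion_proportion(completed, expected):
    """Completed diaries divided by the number of diaries expected per subject."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return completed / expected


# Hypothetical 30-day daily-diary study: diaries completed by each of five subjects.
completed_per_subject = [30, 28, 27, 24, 21]
expected_per_subject = 30

mean_rate = completion_proportion(mean(completed_per_subject), expected_per_subject)
median_rate = completion_proportion(median(completed_per_subject), expected_per_subject)

print(f"mean-based completion: {mean_rate:.2f}")
print(f"median-based completion: {median_rate:.2f}")
```

Reporting both values matters because a few low-completing subjects pull the mean below the median, as in this hypothetical sample.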

We attempted to identify diary characteristics associated with high completion. For the studies in which appropriate data were available (n=9), we plotted the overall completion rate by compensation per diary day and found almost no association (R²=0.0047) (see online supplementary figure S2). We also compared completion rates by diary frequency and length of diary collection period, and found no significant differences.
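For a single predictor, the coefficient of determination from a simple linear fit equals the squared Pearson correlation. The sketch below computes it in plain Python, assuming a simple linear fit was used, which the text does not state. The nine data points are hypothetical placeholders, not the values from the reviewed studies, so the result will not reproduce the reported figure.

```python
def r_squared(x, y):
    """Squared Pearson correlation between x and y (R-squared of a simple linear fit)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("x and y must each vary")
    return s_xy ** 2 / (s_xx * s_yy)


# Hypothetical values for nine studies: compensation per diary day (USD)
# and overall completion proportion.
compensation = [0.14, 0.50, 0.75, 1.00, 1.00, 1.25, 1.50, 2.00, 3.00]
completion = [0.86, 0.91, 0.78, 0.88, 0.95, 0.82, 0.90, 0.79, 0.87]

print(f"R-squared: {r_squared(compensation, completion):.4f}")
```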

### Participant acceptability and privacy

Despite concerns about participant burden,1 diaries were well-received. In three studies reporting acceptability, participants found electronic diaries to be convenient and enjoyed participating.2 ,3 ,37 Two studies reported that participants felt the diaries sufficiently protected their privacy.2 ,37

### Validity

Four studies used diaries to assess the validity of retrospective survey data and examined over-reporting and under-reporting in these surveys (table 3). Garry et al compared diary data with results from a surprise survey 6 months to 12 months later.32 On the survey, subjects under-reported number of partners and over-reported frequency of oral and vaginal sex and condom use. No difference was found for anal sex, the least frequent behaviour.

Table 3. Validity and reactivity findings for studies using web-based diaries

Horvath et al found that MSM significantly over-reported receptive oral and anal sex on a retrospective survey.37 The proportion of participants who correctly reported number of sexual episodes and oral sex was low. Recall was most accurate for unprotected receptive anal sex. Higher diary reports of total sexual episodes, giving or receiving oral sex, and partner ejaculation during oral sex were associated with over-reporting on the survey. Greater frequency of unprotected insertive anal sex was associated with under-reporting.

Glick et al assessed survey validity in a sample of MSM.33 Concordance correlation coefficients and κ statistics exceeded 0.80 for almost all sexual behaviours assessed, indicating considerable agreement between methods.33 Similarly, Lim et al randomised young adults to complete weekly diaries online, on paper or by text message, and compared results with a retrospective survey.2 Correlation was highest for the proportion of regular versus casual partners (0.87), mean frequency of sex (0.76), mean frequency of using condoms (0.76) and STI risk (0.74).2
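To illustrate why a concordance correlation coefficient is informative when comparing diary and survey counts, the sketch below implements Lin's standard formula, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (1/n) moments. The source does not state which estimator the reviewed studies used, and the example counts are hypothetical.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient using population (1/n) moments."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical counts of sex acts for three participants.
diary_counts = [1, 2, 3]
survey_counts = [2, 3, 4]   # each survey report exceeds the diary count by one

print(f"CCC: {concordance_correlation(diary_counts, survey_counts):.3f}")
```

In this hypothetical case the Pearson correlation is 1, yet the concordance coefficient is 4/7 (about 0.571) because every survey value is shifted upward by one. Unlike simple correlation, the statistic therefore penalises systematic over-reporting.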

### Reactivity

Five studies assessed reactivity (table 3), including one controlled study. Glick et al compared quarterly surveys from participants randomised to an active diary or control group.33 Compared with diary subjects, controls reported significantly greater increases over time in the occurrence of anal sex, frequency of anal and unprotected anal sex, and acquisition of new male partners. These behavioural differences may explain the significantly higher rate of incident HIV/STI diagnoses among control subjects (26.1%) than among diary subjects (4.8%, p=0.01).

Four studies assessed reactivity by analysing intrapersonal temporal trends in behaviour, without comparing with a control. In their cellphone study, Hensel et al noted that diary submission rates declined significantly each week, demonstrating completion reactivity, and reports of vaginal sex declined significantly over time, demonstrating behavioural reactivity.3 Horvath et al examined trends in behaviour in a diary study of MSM and found that giving and receiving oral sex, insertive anal sex, and receptive unprotected anal sex decreased significantly over time.37 Two studies concluded that their data showed no evidence of reactivity.38 ,39 ,43 ,44

## Discussion

Web-based diaries are increasingly popular for collecting detailed sexual behaviour data. To date, most studies have been small and not designed to assess methodological questions. There does not appear to be one single web-diary design that best measures sexual risk behaviour, and individual study needs should guide methodological decisions regarding diary frequency, medium and content. Nevertheless, our review identified interesting patterns and lessons learned across studies that implemented web-based diaries.

Diary completion rates in the reviewed studies were generally high. Nearly all studies provided compensation, but the amount of compensation was not associated with completion. While most collected data were based on daily behaviour, researchers succeeded using a variety of diary submission schedules. In one study that directly compared three submission schedules, the less frequent schedules had higher completion rates.33 Most diary studies lasted for 1 month, although some had success with more frequent (eg, thrice daily) or longer duration data collection. In a recent study of young MSM, the decline in completion rates over time may have resulted from higher frequency of sexual behaviour relative to other populations. This suggests that future studies should employ strategies to simplify data collection and encourage long-term participation. Inconsistent reporting of completion data and diary measures hindered our ability to further identify methods associated with improved completion. To enhance comparability, authors should at minimum report the total proportion of diaries received and should consider following the guidelines for reporting completion rates proposed by Stone and Shiffman,51 including reporting what constituted complete and on-time submissions.

Several studies used diaries to assess the validity of retrospective surveys. Condom use, partner type, frequency and type of sex were almost universally measured, as partner-specific and aggregate variables. Overall, correlation between diary and retrospective data was high, indicating that both methods likely provide valid estimates. However, accuracy tended to be greater for less frequent behaviours, and over-reporting on retrospective surveys exceeded under-reporting. Clearly, study aims and data needs should be considered when selecting a data collection tool. This appears to be especially true for high-frequency behaviours (eg, number of sex acts) where diary data likely provide more accurate count data.

It is difficult to draw firm conclusions about reactivity in web-based sexual risk behaviour diaries given the paucity of quality studies. The notion that ‘self-monitoring’ can generate behaviour change has been studied within the context of many health behaviours—notably in exercise and weight loss intervention research—although the findings of controlled studies have overall been equivocal. Nevertheless, among those studies evaluating the effect of diaries (web-based and non-web-based) on sexual behaviours, diary completion has been consistently associated with lower rates of sexual risk behaviour.33 ,52–54 Although further controlled research with more diverse populations is needed to confirm these findings, available data suggest that researchers using diaries should consider that this methodology may, in fact, be an intervention itself.

This review has two main limitations. First, we could not assess the presence of publication bias, but it is possible that web-based diary studies with non-significant findings or inadequate completion were less likely to be published. Second, the broad parameters used to identify articles minimised the risk of missing eligible records, but it is still possible that our strategy did not identify all relevant articles.

Our review highlights several areas for future research on web-based diaries specific to sexual risk behaviour. Research should identify subjects’ preferred diary mediums, frequency and length, and use diaries with populations beyond MSM and university students. Cellphone diaries possess largely unexplored potential, as apps for smartphones can collect data, transmit to study staff and enable staff to remind participants to complete diaries. This medium is particularly compelling for research and prevention efforts among traditionally hard-to-reach populations, as demonstrated in one study of low-income STI clinic attendees.3 In addition, phone-based diaries may be especially effective among young black MSM—a priority population for HIV/STI prevention research in the USA—given that smartphone use appears to be extremely common and accepted as an intervention tool.55 Studies should also examine methods, including compensation, that maximise completion. Reactivity studies should incorporate control groups to avoid conflating temporal trends with behavioural reactivity. These studies could also identify features that affect reactivity, such as frequency, length of collection or medium. Long-term studies should examine if reactivity is temporary or long-lasting. Finally, there is limited information on how missing diary data are handled analytically. Future research should examine if diary entries are missing at random, which would inform the use of imputation.

Web-based diaries provide many benefits, including remote data collection and reduced data entry burden. However, the potentially substantial costs of software, hardware and privacy requirements associated with implementing a web-based methodology indicate that researchers should consider their data needs and population when selecting a web-based diary.1 ,12 ,56 ,57 This tool is a promising technological advancement for HIV/STI research, and future studies should continue to employ rigorous study designs to clarify the most appropriate methodological practices.

### Key messages

• To date, most web-based, quantitative, sexual risk behaviour diaries have been daily, month-long studies collected using websites.

• Methods that maximise diary completion are unclear, partially due to non-systematic reporting of completion rates.

• Compared with diaries, retrospective surveys appear to collect valid measures of sexual risk behaviour, but may overestimate the frequency of more common behaviours.

• Reactivity due to diary completion may decrease sexual risk behaviour, but additional controlled studies are needed.

