IRB for Online Research: Surveys, Social Media, and Digital Data
Online research looks simple from the outside — a Qualtrics link, a Reddit scrape, a Twitter dataset — and that appearance trips up more first-time applicants than any other study design. The regulatory definitions at 45 CFR 46.102 apply online exactly as they do in person, but the operational questions (what counts as public? when is consent required? how do you verify adulthood?) are not resolved by regulation alone. Reviewers have developed norms, and knowing them saves a revision cycle.
Online surveys
Anonymous online surveys with adults on non-sensitive topics are usually exempt. The typical design: a consent page that opens the survey, a click-through that serves as documentation, no identifiers collected. Request a waiver of documentation of consent under 45 CFR 46.117(c), cite it explicitly in your protocol, and label it as a waiver of documentation, not a waiver of consent, which is a separate request under 45 CFR 46.116(f).
If your survey collects identifiers, even email addresses for follow-up, the data is no longer anonymous and review often moves to expedited. Store identifiers in a separate file, linked to the response data only by a study ID, and describe the separation in your data management plan. Our data management plan template shows the expected structure.
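The separation described above can be sketched as follows: identifiers go into one record, responses into another, and a random study ID is the only link between them. This is a minimal illustration, not a prescribed method; the field names (`email`, `name`, `phone`) and the `split_record` helper are hypothetical and should be adapted to your own export.

```python
import secrets

# Hypothetical identifier field names; adjust to match your survey export.
IDENTIFIER_FIELDS = {"email", "name", "phone"}

def split_record(record, identifier_fields=IDENTIFIER_FIELDS):
    """Split one survey record into (key_row, response_row) linked by a random study ID."""
    # Random ID: carries no information about the participant.
    study_id = secrets.token_hex(8)
    key_row = {"study_id": study_id}
    response_row = {"study_id": study_id}
    for field, value in record.items():
        if field in identifier_fields:
            key_row[field] = value       # goes to the restricted key file
        else:
            response_row[field] = value  # goes to the analysis file
    return key_row, response_row

record = {"email": "p1@example.org", "q1": "agree", "q2": "3"}
key_row, response_row = split_record(record)
```

In practice the key rows and the response rows would be written to different locations with different access controls, and the key file destroyed on the schedule stated in the protocol.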
Sensitive topics — mental health, substance use, immigration status, illegal behavior — push online surveys to expedited even if identifiers are not collected, because the disclosure risk if data were breached is non-trivial. Plan accordingly.
Social media data
The question that dominates social media research is whether a post is "public" in the regulatory sense. A public Twitter post is technically observable by anyone, but participants did not post it anticipating research. Reviewers apply a reasonable-expectation-of-privacy test: would a typical user expect this content to appear in a research paper? For public figures and clearly public forums, the answer is often yes; for health communities, addiction recovery groups, and similar spaces, the answer is usually no, even if the forum is technically open.
Document your reasoning in the protocol. If you are collecting content from a space with any reasonable privacy expectation, plan for de-identification of quotes (paraphrasing rather than direct quotation is sometimes required) and avoid reporting usernames.
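One way to avoid storing or reporting usernames is to replace them with keyed pseudonyms at collection time. The sketch below uses HMAC-SHA256 from the Python standard library; the `pseudonymize` helper, the `user_` prefix, and the 12-character truncation are illustrative assumptions, not a reviewer requirement. It addresses usernames only: a verbatim quote can still be traced through a search engine, which is why paraphrasing is sometimes required.

```python
import hashlib
import hmac

def pseudonymize(username, secret_key):
    """Return a stable pseudonym for a username using a keyed hash (HMAC-SHA256).

    secret_key is bytes and should be stored apart from the dataset;
    without it, a pseudonym cannot be recomputed from a known username.
    """
    normalized = username.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return "user_" + digest[:12]  # truncated for readability (assumption)

key = b"replace-with-a-long-random-secret"  # placeholder value
alias_a = pseudonymize("ExampleUser", key)
alias_b = pseudonymize("exampleuser ", key)
```

The same username always maps to the same pseudonym under one key, so posts by one author can still be grouped in analysis.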
Digital trace data
Trace data — clickstream, app usage, wearable sensor data, search logs — is rarely anonymous even when it lacks obvious identifiers. Patterns in the data often re-identify individuals. Treat trace data as identifiable for IRB purposes unless you have strong evidence otherwise.
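A quick way to see why trace data should be treated as identifiable is to count how many records share each combination of seemingly harmless fields. The sketch below computes the size of the smallest such group (the "k" in k-anonymity); the field names and sample rows are made up for illustration. A result of 1 means at least one record is unique on those fields. A larger value is not strong evidence of anonymity by itself, since other fields or outside datasets may still single people out.

```python
from collections import Counter

def smallest_group_size(rows, quasi_identifiers):
    """Return the size of the smallest group of rows sharing the same quasi-identifier values."""
    if not rows:
        return 0
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())

# Hypothetical trace records with no names, emails, or account IDs.
rows = [
    {"zip3": "021", "device": "ios", "active_hour": 9},
    {"zip3": "021", "device": "ios", "active_hour": 9},
    {"zip3": "946", "device": "android", "active_hour": 23},
]
k = smallest_group_size(rows, ["zip3", "device", "active_hour"])
```

Here `k` is 1: the third record is the only one with its combination of values, so it could be singled out by anyone who knows those three facts about a person.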
If you are using secondary data from a platform or a third-party aggregator, confirm the platform's terms of service permit research use and document the provenance in your protocol. Secondary-data studies may qualify for exemption under 45 CFR 46.104(d)(4), which has four sub-provisions, (i) through (iv); cite the one that applies.
Consent online
Three practical design choices matter:
- Key information at the top. The 2018 Common Rule revisions require a concise summary before the full consent. Online, this is the first screen: one or two short paragraphs that tell a participant what the study is, what they will do, how long it will take, and what the risks are.
- Click-through confirmation. Require an explicit action before survey items are revealed. Radio buttons labeled "I agree" and "I do not agree" are standard.
- Printable version. Offer a downloadable PDF of the full form.
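The click-through logic in the list above can be sketched as a small gate function. Survey platforms implement this through their own display or skip logic, so the code below is only an illustration of the rule, with a hypothetical `consent_gate` helper: items are revealed only after an explicit "I agree", and an unanswered consent question is never treated as agreement.

```python
from datetime import datetime, timezone

AGREE = "I agree"
DECLINE = "I do not agree"

def consent_gate(choice):
    """Return (proceed, record) for a consent-page response.

    proceed is True only for an explicit "I agree".
    record holds the choice and a UTC timestamp, with no identifiers;
    it is None when no explicit choice has been made.
    """
    if choice not in (AGREE, DECLINE):
        return False, None  # no explicit action yet: keep items hidden
    record = {
        "choice": choice,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return choice == AGREE, record

proceed, record = consent_gate("I agree")
declined, _ = consent_gate("I do not agree")
unanswered, missing = consent_gate(None)
```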
For consent language specifically drafted for online studies, the consent form guide includes sample paragraphs and the consent form template has the required structure.
Verifying adulthood
If your protocol excludes minors, you cannot just say so; you need a plan to enforce it. Standard approaches are self-reported date of birth, a gating question with skip logic that ends the survey for minors, and placement of recruitment materials on adult platforms. None of these is airtight, and reviewers accept them as reasonable mitigation rather than proof. For research where the distinction matters substantially, a more robust verification (payment system, account verification) may be required.
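A date-of-birth gate of the kind described above can be sketched as follows. The threshold of 18 is an assumption, since the age of majority varies by jurisdiction, and the function names are hypothetical. Like the approaches it illustrates, this relies entirely on self-report.

```python
from datetime import date

MINIMUM_AGE = 18  # assumption: set this to the age of majority for your study population

def age_on(birth_date, today):
    """Return completed years of age on the given day."""
    years = today.year - birth_date.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

def may_continue(birth_date, today=None):
    """Return True if the respondent meets the minimum age; otherwise the survey should end."""
    today = today or date.today()
    return age_on(birth_date, today) >= MINIMUM_AGE

day_before = may_continue(date(2006, 6, 15), today=date(2024, 6, 14))
on_birthday = may_continue(date(2006, 6, 15), today=date(2024, 6, 15))
```

A respondent born on 15 June 2006 is screened out on 14 June 2024 and admitted the next day, which is the boundary case worth testing in any gating logic.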
Data security online
Describe, specifically:
- The platform used (Qualtrics, REDCap, SurveyMonkey, custom application).
- Encryption at rest and in transit.
- Who at your institution administers the account.
- Where data is exported and how it is stored locally.
- How IP address collection is disabled (most platforms collect IPs by default — turn this off in the survey settings and confirm in the protocol).
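As a second line of defense for the last item, an export can be checked and stripped of network-related columns before it is stored locally. This is a sketch, not a substitute for disabling collection in the platform settings. The column names below (`IPAddress`, `LocationLatitude`, `LocationLongitude`) are assumptions and should be checked against the header of your platform's actual export.

```python
import csv
import io

# Assumed column names; verify against your platform's export header.
NETWORK_COLUMNS = {"IPAddress", "LocationLatitude", "LocationLongitude"}

def strip_columns(csv_text, drop=NETWORK_COLUMNS):
    """Return the CSV text with the listed columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in reader.fieldnames if name not in drop]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped keys in each row.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()

raw = "ResponseId,IPAddress,Q1\nR_1,203.0.113.7,agree\n"
cleaned = strip_columns(raw)
```

If any of the dropped columns turns out to contain data, that is a sign the platform setting was not applied and the protocol statement needs to be reconciled with what was actually collected.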
For tool-assisted study design that complements the IRB view of online research, Subthesis research tools covers research methodology in a way that maps cleanly to IRB protocol sections.
Recruitment on social media
Recruiting through social media is generally fine; attach the exact text of every post, including images. Reviewers want to see what the participant will see. For snowball recruitment, describe the mechanism and acknowledge that response rates cannot be calculated.
The synthesis
Online research is not lower-risk by default; it is differently risky. Confidentiality breaches are more likely, consent is harder to verify, and the privacy expectations of participants are less well defined. Reviewers know this, and a protocol that names the risks specifically — rather than dismissing them — moves faster than one that treats "online" as a synonym for "low risk."