Data Access

Social media research is powered by data, but understanding which data is available and how to access it can be a challenging task for researchers. Our Data Access Resource simplifies this process by outlining the types of data offered by each platform, their vetting procedures, and how these connect to the data access obligations under the Digital Services Act.

ELIGIBILITY

Find out who can apply for access to platforms’ data and under what conditions.

ACCESS MODALITIES

See the different ways data can be accessed, from APIs to secure environments.

VETTING PROCESS

Understand the steps and requirements for getting your research application approved.

DATA

See the range of data available, from user and content data to engagement metrics and metadata.

Under Article 40(12) of the Digital Services Act (DSA), very large online platforms (VLOPs) and very large online search engines (VLOSEs) must give researchers, including those affiliated with non-profit organisations, access to publicly available data without undue delay, including real-time data where technically possible. Previously, platforms such as Meta offered this kind of access voluntarily through tools like the now-defunct CrowdTangle; the DSA has made it a legal obligation with defined standards. This change has also influenced smaller platforms, many of which now run dedicated programmes to facilitate data access for researchers.

Accessing social media data often involves navigating platform-specific criteria and vetting processes. Researchers may also need to use technologies such as APIs, user interfaces, or Secure Processing Environments (SPEs). Only after meeting these requirements and receiving approval can they access the data for in-depth, systematic analysis.
Eligibility

Understanding who is eligible to request access to the platform’s data, and the conditions they must meet, is key before getting started. What are the requirements researchers should keep in mind?

 

Since 2023, when DRI began tracking how social media platforms provide data access to researchers, the landscape has shifted significantly. Some platforms have introduced or revised tools to comply with Article 40 of the DSA (often limiting access to EU researchers) while others have expanded access for global research communities.

 

This overview summarizes the available research data access tools across five VLOPs and three non-VLOPs, along with key eligibility criteria. It includes not only programmes designed to meet Article 40(12) of the DSA, but also any initiative that enables researchers to retrieve platform data for research purposes.

 

 

Most of the Data Access Programmes analysed ask researchers to undergo a vetting process to gain access to data. The complexity and eligibility criteria of these processes vary.

 

🔴 TikTok’s Virtual Compute Environment (VCE) and API and LinkedIn’s Beta Researcher Access Program fully follow Article 40(12) of the DSA. They only allow research that helps detect or understand systemic risks in the EU. Because of this, these programs are not relevant for researchers based outside the EU.

 

🟡 Meta’s Content Library and API take a broader approach. They don’t restrict access to certain research topics and are open to researchers from almost all regions. This model goes beyond the DSA’s minimum requirements.

 

🟢 YouTube (Data API v3), Telegram, and Bluesky offer easier access through standard user accounts without formal vetting. This low-barrier model supports transparency and openness, making it ideal for exploratory projects or researchers without institutional backing.
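
As an illustration of this low-barrier model, the sketch below assembles a keyword search request for the YouTube Data API v3 `search.list` method. It only builds the request URL and sends nothing over the network. The helper name `build_search_url`, the placeholder key, and the example query are assumptions made for illustration; parameters and limits should be checked against YouTube's current documentation.

```python
from urllib.parse import urlencode

# Endpoint of the YouTube Data API v3 "search.list" method.
SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(api_key: str, query: str, max_results: int = 25) -> str:
    """Return a search.list URL for public videos matching `query`.

    Hypothetical helper for illustration only. `api_key` stands for a key
    created with a standard Google account; no vetting step is involved.
    """
    # search.list returns at most 50 results per page.
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")
    params = {
        "part": "snippet",      # return basic metadata for each result
        "type": "video",        # restrict results to videos
        "q": query,             # keyword query
        "maxResults": max_results,
        "key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Placeholder key and query; replace both before sending a real request.
url = build_search_url("YOUR_API_KEY", "european elections")
print(url)
```

The resulting URL can be passed to any HTTP client. Each request still counts against the quota attached to the API key.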

 

While X also offers API access with just a user account, its prohibitively high fees make it largely inaccessible.

 

Programs that require vetting often exclude journalists by limiting access to academics or non-profits. This goes against the spirit of Article 40(12), which aims to ensure broad transparency of public data, and overlooks journalism’s key role in investigating online risks.
Access modalities

Platforms provide different technologies and protocols to access their data, from APIs to secure environments. Here is a comparison of the available options, focusing on usability, data coverage, and technical complexity.

 

Even after gaining access, researchers face another challenge: figuring out how to use the platform’s technical systems and rules to retrieve and analyze the data. Access is usually provided through tools like APIs, user interfaces, or secure research environments, each differing in how easy they are to use, how much data they provide, and how technically complex they are.

 

The first table below compares common data access methods by their accessibility and required technical skills. The second table outlines key limitations and usage quotas for each platform.

 

 

Meta and TikTok rely on controlled systems that restrict the downloading of raw data, allowing only aggregated results or research outputs to be exported. YouTube provides comparatively more flexible access through its Researcher Program and Data API, which offer scalable quotas for public data. In contrast, LinkedIn maintains a more opaque access model, with unclear or undisclosed rate limits for its research program.

 

Most platforms apply strict usage quotas and eligibility criteria. Meta limits downloadable data to high-follower accounts (25K+), while TikTok’s Virtual Compute Environment enforces daily caps on record retrieval. YouTube’s API allows broader access but still applies quota systems, and Bluesky, Reddit, and Telegram offer relatively open APIs with defined request limits.
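
To make the practical effect of quota systems concrete, the sketch below estimates how many days a collection would take under a daily unit budget. The function name is hypothetical and the figures are assumptions for illustration: 100 units per search request and 10,000 units per day are commonly cited defaults for the YouTube Data API and may change, so they should be verified against the platform's documentation before planning a project.

```python
import math


def days_needed(total_items: int, items_per_request: int,
                units_per_request: int, daily_quota_units: int) -> int:
    """Estimate the number of days needed to collect `total_items`."""
    if min(total_items, items_per_request,
           units_per_request, daily_quota_units) <= 0:
        raise ValueError("all arguments must be positive")
    # Requests required to page through all items.
    requests_needed = math.ceil(total_items / items_per_request)
    # Requests that fit into one day's quota.
    requests_per_day = daily_quota_units // units_per_request
    if requests_per_day == 0:
        raise ValueError("a single request exceeds the daily quota")
    return math.ceil(requests_needed / requests_per_day)


# Assumed values: 50 results per page, 100 units per search request,
# 10,000 units per day.
print(days_needed(total_items=20_000, items_per_request=50,
                  units_per_request=100, daily_quota_units=10_000))  # prints 4
```

Under these assumptions, 20,000 search results require 400 requests, and a budget of 100 requests per day spreads the collection over four days.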
Vetting process

Exploring the vetting process shows what’s required to have your research application approved, including key considerations from platforms’ Terms of Use, Data Agreements, and Transparency policies.

The table below compares how VLOPs vet researchers seeking data access. It focuses on programmes created to meet Article 40(12) requirements. As with access methods, the vetting procedures differ significantly across platforms.

 

 

 

 

 

A key difference between platforms is how long they take to review research applications. Article 40(12) of the DSA requires that data be provided “without undue delay,” yet review times vary from 6 days to 6 weeks—when timelines are disclosed at all. X, for instance, provides no information, and based on DRI’s experience, its review process can take up to eight months.

 

The vetting process itself also varies in complexity. Some platforms require researchers to complete lengthy forms, agree to strict Terms of Use or Data Agreements, and meet additional conditions. At least two platforms require researchers to share their findings with the platform before publication. Others explicitly ban scraping, even though the European Commission has clarified that scraping for research purposes is allowed under Article 40(12).

 

Previously, transparency reports under the Code of Practice on Disinformation offered limited information on how many research applications were received, approved, or rejected. Most VLOPs included in our study have since opted out of these reporting obligations.
Data

Platforms provide different types of data, from user and content data to engagement metrics and metadata. Here is a comparison of the available data endpoints and data points for each platform.

 

The table below highlights selected data points available on each platform. For this update, we refined our categorization of the types of data platforms provide. We now clearly distinguish between user data, content data, and interaction data, each of which may include metadata such as timestamps or location markers. We also added new categories, including AI-generated content labels, sponsored content labels, and edit indicators showing when content has been modified.
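
The categorization described above can be pictured as a simple record layout. The following is an illustrative sketch only: every class and field name is hypothetical and does not correspond to any platform's actual codebook. A value of `None` stands for a data point the platform does not provide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class UserData:
    user_id: str
    follower_count: Optional[int] = None
    is_verified: Optional[bool] = None


@dataclass
class ContentData:
    content_id: str
    text: str
    created_at: Optional[datetime] = None      # metadata: timestamp
    location: Optional[str] = None             # metadata: location marker
    ai_generated_label: Optional[bool] = None  # AI-generated content label
    sponsored_label: Optional[bool] = None     # sponsored content label
    edited: Optional[bool] = None              # edit indicator


@dataclass
class InteractionData:
    content_id: str
    likes: Optional[int] = None
    shares: Optional[int] = None
    comments: Optional[int] = None


def missing_fields(record) -> List[str]:
    """List the fields left empty (None) in a record, in declaration order."""
    return [name for name, value in vars(record).items() if value is None]


post = ContentData(content_id="c1", text="example", sponsored_label=False)
print(missing_fields(post))
# prints ['created_at', 'location', 'ai_generated_label', 'edited']
```

A helper such as `missing_fields` offers one way to compare what a codebook promises with what a retrieved record actually contains.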

 

 

 

Glossary and Methodological Note

 

 

By reviewing API codebooks (the documents where platforms describe the data they provide) we identified what information is available to researchers. Because we rely on what platforms claim to offer, we cannot confirm the actual quality or consistency of the data researchers receive.

 

We found codebooks for every platform except LinkedIn, for which we were unable to locate detailed API documentation.

 

Overall, in 2025, platforms are expanding the number of data endpoints and data points they make available. However, major gaps remain:

 

  • Popular formats like shorts, reels, and stories are still not accessible to researchers.
  • Some platforms do not provide URLs for posts or comments, making it hard to verify API data against what appears on the platform.
  • Researchers may be allowed to view posts and comments but are not able to download them.
  • Most platforms provide data through Secure Research Environments, which substantially restrict the types of analyses researchers can conduct. These environments often require VPN access or monitor researcher activity, raising valid concerns about platform oversight.
  • Among the VLOPs we reviewed, only YouTube provides data on labels for AI-generated content, even though all major platforms host such content and say they label it.
  • Only Meta and YouTube provide data on sponsored content labels.
  • TikTok and YouTube still do not provide data on verified account status.

 





 
Related resources

  • The Data Access Problem: Limitations on Access to Public Data on VLOPs
  • Decoding Access to Social Media Data: Insights from the CoP Compliance Report
  • What the Scientific Community Needs from Data Access under Art. 40 DSA
  • DRI’s Feedback to the Delegated Regulation on Data Access
  • Why the EC should issue guidance on access to publicly available data
  • DSA in Court: Key takeaways from DRI’s data access case against X
  • Key findings and Recommendations on our Data Access overview 2025