Data Access
Social media research is powered by data, but understanding which data is available and how to access it can be a challenging task for researchers. Our Data Access Resource simplifies this process by outlining the types of data offered by each platform, their vetting procedures, and how these connect to the data access obligations under the Digital Services Act.
ELIGIBILITY
Find out who can apply for access to each platform’s data and under what conditions.
ACCESS MODALITIES
See the different ways data can be accessed, from APIs to secure environments.
VETTING PROCESS
Understand the steps and requirements for getting your research application approved.
DATA
See the range of data available, from user and content data to engagement metrics and metadata.
Before getting started, it is key to understand who is eligible to request access to a platform’s data and which conditions they must meet. What requirements should researchers keep in mind?
Since 2023, when DRI began tracking how social media platforms provide data access to researchers, the landscape has shifted significantly. Some platforms have introduced or revised tools to comply with Article 40 of the DSA (often limiting access to EU researchers) while others have expanded access for global research communities.
This overview summarizes the available research data access tools across five VLOPs and three non-VLOPs, along with key eligibility criteria. It includes not only programmes designed to meet Article 40(12) of the DSA, but also any initiative that enables researchers to retrieve platform data for research purposes.
Most of the Data Access Programmes analysed ask researchers to undergo a vetting process to gain access to data. The complexity and eligibility criteria of these processes vary.
🔴 TikTok’s Virtual Compute Environment (VCE) and API and LinkedIn’s Beta Researcher Access Program fully follow Article 40(12) of the DSA. They only allow research that helps detect or understand systemic risks in the EU. Because of this, these programs are not relevant for researchers based outside the EU.
🟡 Meta’s Content Library and API take a broader approach. They don’t restrict access to certain research topics and are open to researchers from almost all regions. This model goes beyond the DSA’s minimum requirements.
🟢 YouTube (Data API v3), Telegram, and Bluesky offer easier access through standard user accounts without formal vetting. This low-barrier model supports transparency and openness, making it ideal for exploratory projects or researchers without institutional backing.
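As a rough illustration of this low-barrier model, the sketch below builds a request URL for the `videos` endpoint of the YouTube Data API v3. The helper name `build_videos_url`, the placeholder video ID, and the placeholder API key are illustrative assumptions; the sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Base URL of the YouTube Data API v3 `videos` endpoint.
YOUTUBE_VIDEOS_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"


def build_videos_url(video_ids, api_key, parts=("snippet", "statistics")):
    """Return a videos.list request URL for the given video IDs.

    `build_videos_url` is a hypothetical helper written for this example;
    it is not part of any Google client library.
    """
    if not video_ids:
        raise ValueError("at least one video ID is required")
    query = urlencode({
        "part": ",".join(parts),     # which resource sections to return
        "id": ",".join(video_ids),   # comma-separated list of video IDs
        "key": api_key,              # API key tied to a standard Google account
    })
    return f"{YOUTUBE_VIDEOS_ENDPOINT}?{query}"


if __name__ == "__main__":
    # Placeholder values; replace with a real video ID and API key.
    print(build_videos_url(["VIDEO_ID_PLACEHOLDER"], api_key="YOUR_API_KEY"))
```

The resulting URL can be fetched with any HTTP client. Each call counts against the API key's quota.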
While X also offers API access with just a user account, its prohibitively high fees make it largely inaccessible.
Programs that require vetting often exclude journalists by limiting access to academics or non-profits. This goes against the spirit of Article 40(12), which aims to ensure broad transparency of public data, and overlooks journalism’s key role in investigating online risks.
Platforms provide different technologies and protocols to access their data—from APIs to secure environments. Here’s a comparison of the available options, focusing on usability, data coverage, and technical complexity!
Even after gaining access, researchers face another challenge: figuring out how to use the platform’s technical systems and rules to retrieve and analyze the data. Access is usually provided through tools like APIs, user interfaces, or secure research environments, each differing in how easy they are to use, how much data they provide, and how technically complex they are.
The first table below compares common data access methods by their accessibility and required technical skills. The second table outlines key limitations and usage quotas for each platform.
Meta and TikTok rely on controlled systems that restrict the downloading of raw data, allowing only aggregated results or research outputs to be exported. YouTube provides comparatively more flexible access through its Researcher Program and Data API, which offer scalable quotas for public data. In contrast, LinkedIn maintains a more opaque access model, with unclear or undisclosed rate limits for its research program.
Most platforms apply strict usage quotas and eligibility criteria. Meta limits downloadable data to high-follower accounts (25K+), while TikTok’s Virtual Compute Environment enforces daily caps on record retrieval. YouTube’s API allows broader access but still applies quota systems, and Bluesky, Reddit, and Telegram offer relatively open APIs with defined request limits.
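Because quotas and request limits determine how much data a project can realistically collect, it can help to budget requests before starting. The following is a minimal sketch, assuming a daily quota expressed in units and a fixed cost per request. The function name `days_needed` and all numbers in the example are placeholders, not any platform's documented limits.

```python
import math


def days_needed(total_items, items_per_request, cost_per_request, daily_quota):
    """Estimate how many days a collection takes under a daily quota.

    All parameters are assumptions supplied by the researcher; check the
    platform's own documentation for the real values.
    """
    if min(items_per_request, cost_per_request, daily_quota) <= 0:
        raise ValueError("all limits must be positive")
    if cost_per_request > daily_quota:
        raise ValueError("a single request exceeds the daily quota")
    # Number of requests required to retrieve every item.
    requests_needed = math.ceil(total_items / items_per_request)
    # Number of requests the quota allows per day.
    requests_per_day = daily_quota // cost_per_request
    return math.ceil(requests_needed / requests_per_day)


if __name__ == "__main__":
    # Placeholder scenario: 100,000 items, 50 items per request,
    # 100 quota units per request, 10,000 units per day.
    print(days_needed(100_000, 50, 100, 10_000))  # 20
```

With these placeholder values the collection needs 2,000 requests at 100 requests per day, so the estimate is 20 days.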
The table above provides information about possible points of access that researchers can consider when designing their research. In 2024, we expanded our analysis to include new platforms, such as Threads and Bluesky—the latter experiencing significant growth in users. Reflecting the changes introduced by the DSA, this year’s focus is exclusively on data access programmes designed specifically for researchers, excluding APIs meant for developers or commercial purposes.
APIs remain the most common method for platforms to provide research access. However, platforms also offer other data access mechanisms, such as user interfaces and limited scraping.
Most platforms provide data access without location-based restrictions, covering all countries and regions. Some exceptions exist, though. Platforms that grant access solely to meet DSA requirements, such as LinkedIn and X, may limit access to research focused on systemic risks within the European Union.
Exploring the vetting process shows what’s required to have your research application approved, including key considerations from platforms’ Terms of Use, Data Agreements, and Transparency policies.
A key difference between platforms is how long they take to review research applications. Article 40(12) of the DSA requires that data be provided “without undue delay,” yet review times vary from 6 days to 6 weeks—when timelines are disclosed at all. X, for instance, provides no information, and based on DRI’s experience, its review process can take up to eight months.
The vetting process itself also varies in complexity. Some platforms require researchers to complete lengthy forms, agree to strict Terms of Use or Data Agreements, and meet additional conditions. At least two platforms require researchers to share their findings with the platform before publication. Others explicitly ban scraping, even though the European Commission has clarified that scraping for research purposes is allowed under Article 40(12).
Previously, transparency reports under the Code of Practice on Disinformation offered limited information on how many research applications were received, approved, or rejected. Most VLOPs included in our study have since opted out of these reporting obligations.
Platforms provide different types of data, from user and content data to engagement metrics and metadata. Here’s a comparison of the available data endpoints and data points for each platform.
The table below highlights selected data points available on each platform. For this update, we refined our categorization of the types of data platforms provide. We now clearly distinguish between user data, content data, and interaction data, each of which may include metadata such as timestamps or location markers. We also added new categories, including AI-generated content labels, sponsored content labels, and edit indicators showing when content has been modified.
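The categorization described above can be sketched as a small set of record types. This is an illustrative model only: the class and field names below are assumptions made for this example and do not correspond to any platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Metadata:
    """Metadata that may accompany any record type."""
    timestamp: Optional[datetime] = None
    location: Optional[str] = None


@dataclass
class UserRecord:
    user_id: str
    metadata: Metadata = field(default_factory=Metadata)


@dataclass
class ContentRecord:
    content_id: str
    author_id: str
    # None means the platform does not expose the label at all,
    # which is different from a label that is present and False.
    ai_generated_label: Optional[bool] = None
    sponsored_label: Optional[bool] = None
    edited: Optional[bool] = None
    metadata: Metadata = field(default_factory=Metadata)


@dataclass
class InteractionRecord:
    content_id: str
    kind: str   # e.g. "like", "share", "comment"
    count: int
    metadata: Metadata = field(default_factory=Metadata)


def missing_labels(record: ContentRecord) -> list:
    """Return the names of label fields the platform did not supply."""
    labels = ("ai_generated_label", "sponsored_label", "edited")
    return [name for name in labels if getattr(record, name) is None]


if __name__ == "__main__":
    post = ContentRecord("c1", "u1", sponsored_label=False)
    print(missing_labels(post))  # ['ai_generated_label', 'edited']
```

Keeping "not provided" (`None`) separate from "provided and false" makes it easier to compare which labels each platform actually exposes.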
Glossary and Methodological Note
By reviewing API codebooks (the documents where platforms describe the data they provide) we identified what information is available to researchers. Because we rely on what platforms claim to offer, we cannot confirm the actual quality or consistency of the data researchers receive.
We found codebooks for every platform except LinkedIn, for which we were unable to locate detailed API documentation.
Overall, in 2025, platforms are expanding the number of data endpoints and data points they make available. However, major gaps remain:
- Popular formats like shorts, reels, and stories are still not accessible to researchers.
- Some platforms do not provide URLs for posts or comments, making it hard to verify API data against what appears on the platform.
- Researchers may be allowed to view posts and comments but are not able to download them.
- Most platforms provide data through Secure Research Environments, which substantially restrict the types of analyses researchers can conduct. These environments often require VPN access or monitor researcher activity, raising valid concerns about platform oversight.
- Among the VLOPs we reviewed, only YouTube provides data on labels for AI-generated content, even though all major platforms host such content and say they label it.
- Only Meta and YouTube provide data on sponsored content labels.
- TikTok and YouTube still do not provide data on verified account status.
Building on our earlier analyses, we developed a methodology to evaluate platforms based on two key factors: the data points they make available and the vetting processes researchers must navigate. While platforms are making strides in offering more data points, the vetting process has become noticeably more restrictive and complex.
Another critical issue is the quality and consistency of the data provided, which can only be assessed once researchers gain access. While this was beyond the scope of our current analysis, it remains a vital area for future investigation.
Related resources
The Data Access Problem: Limitations on Access to Public Data on VLOPs
Decoding Access to Social Media Data: Insights from the CoP Compliance Report
What the Scientific Community Needs from Data Access under Art. 40 DSA
DRI’s Feedback to the Delegated Regulation on Data Access
Why the EC should issue guidance on access to publicly available data
DSA in Court: Key takeaways from DRI’s data access case against X
Key findings and Recommendations on our Data Access overview 2025