All the data in one single place

Download all the CORE data in a single package.

Matching your needs

Prototype, analyse and process your data directly on your infrastructure.

Largest full text collection

The world's largest full-text collection of scientific papers for machine processing.

Simple to use

Accessible and easy to understand documentation and processes.

How it works

CORE data can be downloaded as a bulk dataset, allowing you to process it on your own computer or within your infrastructure. The dataset provides the harmonised and enriched metadata and full-text content aggregated by CORE, delivered in a machine-readable format. This makes it well suited to prototyping new methods, especially when data-intensive processing needs to be run. It is also a good choice for data analysis and text mining.
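As a rough illustration of processing the bulk download on your own machine, the sketch below streams records from a compressed archive without unpacking it to disk first. The archive layout is an assumption made for this example (a tar archive whose `.json` or `.jsonl` members hold one JSON object per line), and `iter_json_records` is a hypothetical helper, not part of any CORE tooling. Check the dataset documentation for the actual structure before relying on it.

```python
import io
import json
import tarfile
from typing import Iterator


def iter_json_records(archive_path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of each JSON member.

    Assumed (hypothetical) layout: a tar archive, optionally compressed,
    containing .json or .jsonl files with one JSON object per line.
    """
    # "r:*" lets tarfile detect gzip, bz2 or xz compression on its own.
    with tarfile.open(archive_path, mode="r:*") as archive:
        for member in archive:
            # Skip directories and anything that is not a JSON file.
            if not member.isfile() or not member.name.endswith((".json", ".jsonl")):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            # Decode lazily so only one line is held in memory at a time.
            for raw_line in io.TextIOWrapper(handle, encoding="utf-8"):
                line = raw_line.strip()
                if line:
                    yield json.loads(line)
```

Because the function is a generator, a loop such as `for record in iter_json_records("sample.tar.gz"): ...` touches one record at a time, which matters when the extracted data is far larger than available memory.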

Access documentation

Eric Olson

Consensus, co-founder and CEO

“To build the product we have always envisioned, having a robust and comprehensive dataset of machine-readable, peer-reviewed papers is absolutely essential. We are incredibly grateful to be able to partner with an organization like CORE that not only can meet our data needs, but also shares our vision of making science more accessible and consumable. This unique combination of best-in-class data-offering and mission-alignment makes CORE an ideal partner for Consensus.”

Dataset 2020-03-18

Full dataset (~400 GB, 2.1 TB extracted)
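Given the sizes quoted above, it can be worth checking free disk space before starting the download. The following is a minimal sketch: it assumes decimal units (1 GB = 10^9 bytes) and that the archive and the extracted data share one volume. `enough_space` is a hypothetical helper name.

```python
import shutil
from typing import Optional

# Sizes quoted for the 2020-03-18 full dataset (decimal units assumed).
ARCHIVE_BYTES = 400 * 10**9       # ~400 GB compressed
EXTRACTED_BYTES = 2_100 * 10**9   # 2.1 TB extracted


def enough_space(target_dir: str, keep_archive: bool = True,
                 free_bytes: Optional[int] = None) -> bool:
    """Return True if target_dir's volume can hold the extracted dataset.

    When keep_archive is True, the compressed archive is assumed to sit on
    the same volume during extraction, so both sizes are added together.
    free_bytes can be passed explicitly, mainly for testing.
    """
    needed = EXTRACTED_BYTES + (ARCHIVE_BYTES if keep_archive else 0)
    if free_bytes is None:
        free_bytes = shutil.disk_usage(target_dir).free
    return free_bytes >= needed
```

Calling `enough_space("/data")` with a real path compares the volume's free space against roughly 2.5 TB; the archive size is approximate, so leave some headroom.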

Dataset 2018-03-01

Metadata-only dataset (beta) (127 GB): 123M metadata items, 85.6M items with abstracts.

Full-text dataset (beta) (330 GB): 123M metadata items, 85.6M items with abstracts, 9.8M items with full text.

Documentation and access to previous datasets.

We aim to generate a new public dataset at least once a year. If you need a more recent dataset, please get in touch with us as we might be able to arrange it.

If you use CORE in your work, we kindly ask that you cite one of our publications.

cite publication

What’s included

The dataset provides you with:

  • The entire CORE corpus of metadata and full texts in a machine-processable format.
  • Detailed documentation on how to download the CORE dataset and how the data is organised.
Datasets by year: latest, 2022, 2021, 2020, 2019

Register for the CORE Dataset

Enter your email address to register for our datasets or access the download page if you have already registered. Please enter your institutional email if you are registering in an institutional capacity.

We will send the instructions to this address.

The terms of use for the dataset are available on our datasets download page.