
NIPS 2015 Papers

Explore and analyze this year's NIPS papers

@kaggle.benhamner_nips_2015_papers

About this Dataset

Neural Information Processing Systems (NIPS) is one of the top machine learning conferences in the world. It covers topics ranging from deep learning and computer vision to cognitive science and reinforcement learning.

This year, Kaggle is hosting the NIPS 2015 paper dataset to facilitate and showcase exploratory analytics on the NIPS data. We've extracted the paper text from the raw PDF files and are releasing it both as CSV files and as a SQLite database. Here's a quick script that gives an overview of what's included in the data.

We encourage you to explore this data and share what you find through Kaggle Scripts!

Data Description

Overview of the data in Kaggle Scripts.

nips-2015-papers-release-*.zip (downloadable from the link above) contains the files and folders listed below. All of this data is also available through Kaggle Scripts, where you can create a new script to immediately start exploring the data in R, Python, Julia, or SQLite.

This dataset is available in two formats: three CSV files and a single SQLite database (consisting of three tables with content identical to the CSV files).

You can see the code used to create this dataset on GitHub.

Papers.csv

This file contains one row for each of the 403 NIPS papers from this year's conference. It includes the following fields:

  • Id - unique identifier for the paper (equivalent to the one in NIPS's system)
  • Title - title of the paper
  • EventType - whether it's a poster, oral, or spotlight presentation
  • PdfName - filename for the PDF document
  • Abstract - text for the abstract (scraped from the NIPS website)
  • PaperText - raw text from the PDF document (created using the tool pdftotext)
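
As a rough illustration of working with these fields, the sketch below tallies papers by EventType using only the Python standard library. The function name count_event_types is made up for this example, and it assumes the CSV header row uses exactly the field names listed above.

```python
import csv
from collections import Counter

# PaperText cells hold whole papers, so raise the csv module's
# per-field size limit (the default is 131072 characters).
csv.field_size_limit(10**9)


def count_event_types(lines):
    """Count papers per EventType.

    `lines` is any iterable of CSV lines, e.g. an open file object
    for Papers.csv. Assumes the header contains an EventType column.
    """
    reader = csv.DictReader(lines)
    return Counter(row["EventType"] for row in reader)
```

To run it against the real file, something like `with open("Papers.csv", newline="", encoding="utf-8") as f: print(count_event_types(f))` should work, assuming the file sits in the working directory and is UTF-8 encoded.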

Authors.csv

This file contains the IDs and names of the authors of this year's NIPS papers.

  • Id - unique identifier for the author (equivalent to the one in NIPS's system)
  • Name - author's name

PaperAuthors.csv

This file links papers to their corresponding authors.

  • Id - unique identifier for the paper-author link
  • PaperId - id for the paper (matches Id in Papers.csv)
  • AuthorId - id for the author (matches Id in Authors.csv)
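
Because PaperAuthors.csv is a link table, resolving the authors of a paper takes a lookup through Authors.csv. Here is a minimal sketch of that join with the standard library; authors_by_paper is a hypothetical helper name, and the code assumes the column names match the fields listed above.

```python
import csv
from collections import defaultdict


def authors_by_paper(authors_lines, paper_authors_lines):
    """Map each PaperId to the list of its authors' names.

    Both arguments are iterables of CSV lines (e.g. open file objects
    for Authors.csv and PaperAuthors.csv). Ids are kept as strings,
    exactly as they appear in the files.
    """
    # Authors.csv: author Id -> Name
    names = {row["Id"]: row["Name"] for row in csv.DictReader(authors_lines)}

    result = defaultdict(list)
    # PaperAuthors.csv: one row per (paper, author) link
    for row in csv.DictReader(paper_authors_lines):
        # Fall back to a placeholder if an AuthorId has no match.
        result[row["PaperId"]].append(names.get(row["AuthorId"], "<unknown>"))
    return dict(result)
```

The returned dictionary can then be matched against the Id column of Papers.csv to attach titles to each author list.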

database.sqlite

This SQLite database contains three tables with the same data and formatting as the Papers.csv, Authors.csv, and PaperAuthors.csv files.
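
The same kind of join can be expressed in SQL against the database. The sketch below assumes the tables are named Authors and PaperAuthors after their CSV counterparts, which this description does not state explicitly; check the actual names (for instance with `SELECT name FROM sqlite_master`) before relying on it. The helper name top_authors is invented for the example.

```python
import sqlite3


def top_authors(conn, limit=10):
    """Return (author name, paper count) rows, most prolific first.

    Assumes tables named Authors and PaperAuthors with the columns
    described for the corresponding CSV files.
    """
    query = """
        SELECT a.Name, COUNT(*) AS n_papers
        FROM PaperAuthors AS pa
        JOIN Authors AS a ON a.Id = pa.AuthorId
        GROUP BY a.Id, a.Name
        ORDER BY n_papers DESC, a.Name
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()
```

A typical call would be `top_authors(sqlite3.connect("database.sqlite"))`, assuming the database file is in the working directory.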

pdfs

This folder contains the raw PDF file for each of the papers.
