Baselight

Banned Book Dataset

Dataset of ~5k banned books + 7.5k non-banned books

@kaggle.chielerli_banned_book_dataset

About this Dataset

Banned Book Dataset

Book bans limit access to information and restrict freedom of expression. There has been no comprehensive data for training ML models to predict whether a book will be censored (challenged or banned). This dataset aims to address that gap. The titles and authors of banned books come from non-profits such as the ALA and PEN America, while metadata such as description and genre was obtained by web scraping Goodreads.
The books labeled as uncensored were obtained from Kaggle and then filtered.

Tables

Merged Dataset

@kaggle.chielerli_banned_book_dataset.merged_dataset
  • 10.66 MB
  • 17440 rows
  • 5 columns

CREATE TABLE merged_dataset (
  "author" VARCHAR,
  "title" VARCHAR,
  "description" VARCHAR,
  "genre" VARCHAR,
  "banned" BIGINT
);
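
As a rough sketch of how a table with this schema could be queried, the snippet below recreates it in an in-memory SQLite database and counts rows per label. The rows are invented placeholders, not records from the dataset, and reading "banned" as 1 = banned, 0 = not banned is an assumption that the page does not confirm.

```python
import sqlite3

# Schema copied from the table definition above; SQLite accepts these type names.
SCHEMA = """
CREATE TABLE merged_dataset (
  "author" VARCHAR,
  "title" VARCHAR,
  "description" VARCHAR,
  "genre" VARCHAR,
  "banned" BIGINT
);
"""

# Placeholder rows, invented for illustration only; they are NOT from the dataset.
# Assumption: banned = 1 means banned/challenged, banned = 0 means not banned.
PLACEHOLDER_ROWS = [
    ("Author A", "Title A", "Description A", "Fiction", 1),
    ("Author B", "Title B", "Description B", "Fantasy", 0),
    ("Author C", "Title C", "Description C", "Fiction", 0),
]


def label_counts(conn):
    """Return a dict mapping each value of "banned" to its row count."""
    cur = conn.execute(
        "SELECT banned, COUNT(*) FROM merged_dataset GROUP BY banned ORDER BY banned"
    )
    return dict(cur.fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO merged_dataset VALUES (?, ?, ?, ?, ?)", PLACEHOLDER_ROWS
)
print(label_counts(conn))
```

Running the same GROUP BY query against the real table is a quick way to check the class balance before training a classifier.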
