Banned Book Dataset
Dataset of ~5k banned books + 7.5k non-banned books
@kaggle.chielerli_banned_book_dataset
Book bans limit access to information and restrict freedom of expression. There has been no comprehensive dataset for training ML models to predict whether a book will be censored (challenged or banned). This dataset aims to address that. The titles and authors of banned books come from non-profits such as the ALA and PEN America, while metadata such as description and genre were obtained by web scraping Goodreads.
The books labeled as uncensored were obtained from Kaggle and then filtered.
CREATE TABLE merged_dataset (
"author" VARCHAR,
"title" VARCHAR,
"description" VARCHAR,
"genre" VARCHAR,
"banned" BIGINT
);
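As a rough illustration of how the schema above might be used, the following sketch loads it into an in-memory SQLite database and counts rows per label. It makes several assumptions not stated on this page: that `banned` is a 0/1 flag (1 for banned, 0 for not banned), that SQLite is an acceptable stand-in for the original database engine, and that the helper name `label_counts` is purely hypothetical. The inserted rows are placeholders, not records from the dataset.

```python
import sqlite3

# Schema copied from the dataset page. SQLite accepts VARCHAR and BIGINT
# as type names, so the statement runs unchanged.
SCHEMA = """
CREATE TABLE merged_dataset (
    "author" VARCHAR,
    "title" VARCHAR,
    "description" VARCHAR,
    "genre" VARCHAR,
    "banned" BIGINT
);
"""


def label_counts(conn):
    """Return {label: row_count} for the "banned" column (hypothetical helper)."""
    rows = conn.execute(
        'SELECT "banned", COUNT(*) FROM merged_dataset GROUP BY "banned"'
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Placeholder rows for illustration only; these are NOT real dataset records.
# Assumption: banned = 1 means banned, banned = 0 means not banned.
conn.executemany(
    "INSERT INTO merged_dataset VALUES (?, ?, ?, ?, ?)",
    [
        ("Author A", "Title A", "Description A", "Genre A", 1),
        ("Author B", "Title B", "Description B", "Genre B", 0),
        ("Author C", "Title C", "Description C", "Genre C", 0),
    ],
)

counts = label_counts(conn)
print(counts)
```

Checking the label balance this way is a common first step before training a classifier, since the description suggests roughly 5k banned versus 7.5k non-banned rows.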