Banned Book Dataset

Dataset of ~5k banned books and 7.5k non-banned books

@kaggle.chielerli_banned_book_dataset

About this Dataset

Book bans limit access to information and restrict freedom of expression. There has been no comprehensive data for training ML models to predict whether a book will be censored (challenged or banned). This dataset aims to address that gap. The titles and authors of banned books were obtained from non-profits such as the ALA and PEN America, while metadata such as descriptions and genres was obtained by web scraping Goodreads.
The books labeled as uncensored were obtained from Kaggle and then filtered.
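As a rough illustration of how two such sources might be combined into one labeled set, here is a minimal Python sketch. The field names (`title`, `author`, `censored`), the sample records, and the filtering rule (dropping any candidate book that also appears in the banned list) are all assumptions made for this example. The dataset description does not say how the filtering was actually done.

```python
def normalize(text):
    """Lowercase and collapse whitespace so titles and authors compare reliably."""
    return " ".join(text.lower().split())


def build_labeled_set(banned, candidates):
    """Combine banned and candidate books into one labeled list.

    Assumed filtering rule: a candidate whose (title, author) pair matches
    a banned book is dropped, so no book appears under both labels.
    """
    banned_keys = {(normalize(b["title"]), normalize(b["author"])) for b in banned}
    labeled = [{**b, "censored": 1} for b in banned]
    for book in candidates:
        key = (normalize(book["title"]), normalize(book["author"]))
        if key not in banned_keys:
            labeled.append({**book, "censored": 0})
    return labeled


# Hypothetical records, for illustration only (not taken from the dataset).
banned = [{"title": "Example Banned Title", "author": "A. Author"}]
candidates = [
    {"title": "example banned  title", "author": "a. author"},  # duplicate of the banned book
    {"title": "Another Title", "author": "B. Writer"},
]
dataset = build_labeled_set(banned, candidates)
```

Matching on a normalized (title, author) pair is only one possible choice. Real catalog data would likely need fuzzier matching to handle subtitles, editions, and variant author spellings.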
