Baselight

Goodreads Book Datasets With User Rating 2M

This dataset is updated every 2 days.

@kaggle.bahramjannesarr_goodreads_book_datasets_10m


About this Dataset

Best quote ever:

Don't ever tell anybody anything, if you do, you start missing everybody
J.D. Salinger

Story

Everyone knows Goodreads. When book lovers want to buy a book, they first search for its title on the site and read the reviews and ratings available for it.
Do you know a better place to scrape this data from? Tell us at ba.jannesar@gmail.com or ghaderi.soroush1995@gmail.com.
Goodreads is one of the best places for this job! 💯

These datasets are well suited to two jobs:

1. Creating a book recommendation system based on 10 M books 🥇
2. Using the Description column for NLP 🥈

GitHub repo

Project link on GitHub or here.

Content

Approximately 10,000,000 books are available in the site's archives, and these datasets are collected from them. To send requests to the API, we used the Goodreads Python library.
The datasets will be updated every 2 days.

Acknowledgements

This data was scraped entirely from the Goodreads API.

Inspiration

Do you know what NLP is? Download these datasets, then upvote 💯.

Book Sample

JSON:

 {
    "Id": "5107",
    "Name": "The Catcher in the Rye",
    "RatingDist1": "1:133165",
    "RatingDist2": "2:224884",
    "RatingDist3": "3:553476",
    "RatingDist4": "4:808278",
    "RatingDist5": "5:891037",
    "pagesNumber": 277,
    "RatingDistTotal": "total:2610840",
    "PublishMonth": 30,
    "PublishDay": 1,
    "Publisher": "Back Bay Books",
    "CountsOfReview": 44046,
    "PublishYear": 2001,
    "Language": "eng",
    "Authors": "J.D. Salinger",
    "Rating": 3.8,
    "ISBN": "0316769177",
    "Count of text reviews": 55539,
    "Description": "The hero-narrator of The Catcher in the Rye is an ancient child of sixteen, a native New Yorker named Holden Caulfield. Through circumstances that tend to preclude adult, secondhand description, he leaves his prep school in Pennsylvania and goes underground in New York City for three days. "
 }

Or CSV:

5107,The Catcher in the Rye,1:133165,277,4:808278,total:2610840,30,1,Back Bay Books,44046,2001,eng,J.D. Salinger,3.8,2:224884,5:891037,0316769177,3:553476,55539,"The hero-narrator of The Catcher in the Rye is an ancient child of sixteen, a native New Yorker named Holden Caulfield. Through circumstances that tend to preclude adult, secondhand description, he leaves his prep school in Pennsylvania and goes underground in New York City for three days. "
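
The RatingDist fields in the sample are strings of the form label:count (for example 5:891037 and total:2610840), so they need parsing before any numeric work. The following is a minimal Python sketch using the values from the sample record; the helper names parse_rating_dist and mean_rating are illustrative and are not part of the dataset or of the Goodreads library.

```python
def parse_rating_dist(field: str) -> tuple:
    """Split a 'label:count' string such as '5:891037' into ('5', 891037)."""
    label, _, count = field.partition(":")
    return label.strip(), int(count)


def mean_rating(fields: list) -> float:
    """Weighted mean star rating from the per-star fields.

    Fields whose label is not a digit (such as 'total') are ignored.
    """
    counts = dict(parse_rating_dist(f) for f in fields)
    stars = {int(k): v for k, v in counts.items() if k.isdigit()}
    total = sum(stars.values())
    if total == 0:
        return 0.0  # assumption: report 0.0 for a book with no ratings
    return sum(star * n for star, n in stars.items()) / total


# Values copied from the sample record above.
sample = ["1:133165", "2:224884", "3:553476", "4:808278", "5:891037", "total:2610840"]
print(round(mean_rating(sample), 1))
```

For the sample record, the five per-star counts add up to the total field, and the computed mean rounds to the listed Rating of 3.8.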

Tables

Book1000k 1100k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1000k_1100k
  • 20.35 MB
  • 39705 rows
  • 20 columns

CREATE TABLE book1000k_1100k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book100k 200k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book100k_200k
  • 4.32 MB
  • 57046 rows
  • 18 columns

CREATE TABLE book100k_200k (
  "pagesnumber" BIGINT,
  "authors" VARCHAR,
  "publisher" VARCHAR,
  "rating" DOUBLE,
  "language" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist3" VARCHAR,
  "countsofreview" BIGINT,
  "publishday" BIGINT,
  "isbn" VARCHAR,
  "ratingdist4" VARCHAR,
  "publishmonth" BIGINT,
  "id" BIGINT,
  "publishyear" BIGINT,
  "ratingdist1" VARCHAR,
  "ratingdist2" VARCHAR,
  "name" VARCHAR
);

Book1 100k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1_100k
  • 4.35 MB
  • 58292 rows
  • 18 columns

CREATE TABLE book1_100k (
  "id" BIGINT,
  "name" VARCHAR,
  "ratingdist1" VARCHAR,
  "pagesnumber" BIGINT,
  "ratingdist4" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "countsofreview" BIGINT,
  "publishyear" BIGINT,
  "language" VARCHAR,
  "authors" VARCHAR,
  "rating" DOUBLE,
  "ratingdist2" VARCHAR,
  "ratingdist5" VARCHAR,
  "isbn" VARCHAR,
  "ratingdist3" VARCHAR
);

Book1100k 1200k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1100k_1200k
  • 20.94 MB
  • 41892 rows
  • 20 columns

CREATE TABLE book1100k_1200k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book1200k 1300k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1200k_1300k
  • 21.25 MB
  • 43622 rows
  • 20 columns

CREATE TABLE book1200k_1300k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book1300k 1400k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1300k_1400k
  • 18.27 MB
  • 38288 rows
  • 20 columns

CREATE TABLE book1300k_1400k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book1400k 1500k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1400k_1500k
  • 16.8 MB
  • 34759 rows
  • 20 columns

CREATE TABLE book1400k_1500k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book1500k 1600k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1500k_1600k
  • 16.06 MB
  • 33439 rows
  • 20 columns

CREATE TABLE book1500k_1600k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book1600k 1700k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1600k_1700k
  • 15.6 MB
  • 32986 rows
  • 20 columns

CREATE TABLE book1600k_1700k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book1700k 1800k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1700k_1800k
  • 15.14 MB
  • 32105 rows
  • 19 columns

CREATE TABLE book1700k_1800k (
  "authors" VARCHAR,
  "countsofreview" BIGINT,
  "description" VARCHAR,
  "isbn" VARCHAR,
  "id" BIGINT,
  "language" VARCHAR,
  "name" VARCHAR,
  "publishday" BIGINT,
  "publishmonth" BIGINT,
  "publishyear" BIGINT,
  "publisher" VARCHAR,
  "rating" DOUBLE,
  "ratingdist1" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "pagesnumber" BIGINT
);

Book1800k 1900k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1800k_1900k
  • 17.69 MB
  • 38863 rows
  • 19 columns

CREATE TABLE book1800k_1900k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR
);

Book1900k 2000k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book1900k_2000k
  • 19.36 MB
  • 43561 rows
  • 19 columns

CREATE TABLE book1900k_2000k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR
);

Book2000k 3000k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book2000k_3000k
  • 169.35 MB
  • 395957 rows
  • 19 columns

CREATE TABLE book2000k_3000k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR
);

Book200k 300k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book200k_300k
  • 4.27 MB
  • 56182 rows
  • 18 columns

CREATE TABLE book200k_300k (
  "publisher" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "ratingdist5" VARCHAR,
  "publishday" BIGINT,
  "name" VARCHAR,
  "rating" DOUBLE,
  "pagesnumber" BIGINT,
  "language" VARCHAR,
  "publishmonth" BIGINT,
  "id" BIGINT,
  "ratingdist4" VARCHAR,
  "ratingdist1" VARCHAR,
  "isbn" VARCHAR,
  "ratingdist2" VARCHAR,
  "countsofreview" BIGINT,
  "authors" VARCHAR,
  "ratingdist3" VARCHAR,
  "publishyear" BIGINT
);

Book3000k 4000k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book3000k_4000k
  • 101.26 MB
  • 256595 rows
  • 19 columns

CREATE TABLE book3000k_4000k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR
);

Book300k 400k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book300k_400k
  • 4.3 MB
  • 56586 rows
  • 18 columns

CREATE TABLE book300k_400k (
  "ratingdist4" VARCHAR,
  "ratingdist1" VARCHAR,
  "isbn" VARCHAR,
  "authors" VARCHAR,
  "id" BIGINT,
  "pagesnumber" BIGINT,
  "language" VARCHAR,
  "ratingdist3" VARCHAR,
  "name" VARCHAR,
  "publishyear" BIGINT,
  "countsofreview" BIGINT,
  "ratingdist5" VARCHAR,
  "publishmonth" BIGINT,
  "ratingdist2" VARCHAR,
  "publishday" BIGINT,
  "ratingdisttotal" VARCHAR,
  "rating" DOUBLE,
  "publisher" VARCHAR
);

Book4000k 5000k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book4000k_5000k
  • 103.86 MB
  • 280256 rows
  • 19 columns

CREATE TABLE book4000k_5000k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR
);

Book400k 500k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book400k_500k
  • 4.21 MB
  • 55155 rows
  • 18 columns

CREATE TABLE book400k_500k (
  "publishyear" BIGINT,
  "rating" DOUBLE,
  "ratingdisttotal" VARCHAR,
  "isbn" VARCHAR,
  "ratingdist1" VARCHAR,
  "publisher" VARCHAR,
  "publishmonth" BIGINT,
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "publishday" BIGINT,
  "ratingdist2" VARCHAR,
  "pagesnumber" BIGINT,
  "ratingdist3" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR
);

Book500k 600k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book500k_600k
  • 4.17 MB
  • 54859 rows
  • 18 columns

CREATE TABLE book500k_600k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT
);

Book600k 700k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book600k_700k
  • 26.38 MB
  • 55156 rows
  • 19 columns

CREATE TABLE book600k_700k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR
);

Book700k 800k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book700k_800k
  • 25.71 MB
  • 54273 rows
  • 20 columns

CREATE TABLE book700k_800k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book800k 900k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book800k_900k
  • 25.28 MB
  • 49843 rows
  • 20 columns

CREATE TABLE book800k_900k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);

Book900k 1000k

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.book900k_1000k
  • 21.1 MB
  • 40890 rows
  • 20 columns

CREATE TABLE book900k_1000k (
  "id" BIGINT,
  "name" VARCHAR,
  "authors" VARCHAR,
  "isbn" VARCHAR,
  "rating" DOUBLE,
  "publishyear" BIGINT,
  "publishmonth" BIGINT,
  "publishday" BIGINT,
  "publisher" VARCHAR,
  "ratingdist5" VARCHAR,
  "ratingdist4" VARCHAR,
  "ratingdist3" VARCHAR,
  "ratingdist2" VARCHAR,
  "ratingdist1" VARCHAR,
  "ratingdisttotal" VARCHAR,
  "countsofreview" BIGINT,
  "language" VARCHAR,
  "pagesnumber" BIGINT,
  "description" VARCHAR,
  "count_of_text_reviews" BIGINT
);
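
The book tables above do not share one schema: they list 18, 19 or 20 columns, and the column order differs between tables. Before combining them, each table has to be aligned to a single column list. The sketch below builds a UNION ALL query in Python that fills missing columns with NULL. It assumes the SQL engine accepts CAST(NULL AS type) and that the tables can be addressed by the bare names used in the CREATE TABLE statements; the function names select_for and union_all are hypothetical.

```python
# Canonical column order and types, copied from the 20-column tables above.
CANONICAL = [
    ("id", "BIGINT"), ("name", "VARCHAR"), ("authors", "VARCHAR"),
    ("isbn", "VARCHAR"), ("rating", "DOUBLE"), ("publishyear", "BIGINT"),
    ("publishmonth", "BIGINT"), ("publishday", "BIGINT"),
    ("publisher", "VARCHAR"), ("ratingdist5", "VARCHAR"),
    ("ratingdist4", "VARCHAR"), ("ratingdist3", "VARCHAR"),
    ("ratingdist2", "VARCHAR"), ("ratingdist1", "VARCHAR"),
    ("ratingdisttotal", "VARCHAR"), ("countsofreview", "BIGINT"),
    ("language", "VARCHAR"), ("pagesnumber", "BIGINT"),
    ("description", "VARCHAR"), ("count_of_text_reviews", "BIGINT"),
]


def select_for(table, missing):
    """Build one SELECT that emits every canonical column, in canonical order."""
    cols = []
    for name, sql_type in CANONICAL:
        if name in missing:
            # Column absent from this table: emit a typed NULL placeholder.
            cols.append(f'CAST(NULL AS {sql_type}) AS "{name}"')
        else:
            cols.append(f'"{name}"')
    return f"SELECT {', '.join(cols)} FROM {table}"


def union_all(tables):
    """Join the per-table SELECTs; `tables` maps table name -> missing columns."""
    return "\nUNION ALL\n".join(select_for(t, m) for t, m in tables.items())


# Three of the tables listed above, one for each column count.
tables = {
    "book1_100k": {"description", "count_of_text_reviews"},  # 18 columns
    "book600k_700k": {"count_of_text_reviews"},              # 19 columns
    "book700k_800k": set(),                                  # 20 columns
}
print(union_all(tables))
```

Naming every column explicitly, rather than using SELECT *, is what makes the differing column orders harmless here.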

User Rating 0 To 1000

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.user_rating_0_to_1000
  • 735.62 KB
  • 51945 rows
  • 3 columns

CREATE TABLE user_rating_0_to_1000 (
  "id" BIGINT,
  "name" VARCHAR,
  "rating" VARCHAR
);

User Rating 1000 To 2000

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.user_rating_1000_to_2000
  • 790.65 KB
  • 42986 rows
  • 3 columns

CREATE TABLE user_rating_1000_to_2000 (
  "id" BIGINT,
  "name" VARCHAR,
  "rating" VARCHAR
);
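
Both user rating tables store rating as VARCHAR, and this page shows no sample values for it. The sketch below shows one way to turn the column into a 1 to 5 star number in Python. It rests on an assumption that should be checked against the actual data: that each value is either a numeric string or one of the labels Goodreads uses for its star scale. The function name rating_to_stars is hypothetical.

```python
# Goodreads' wording for its 1-5 star scale. Whether this dataset's
# "rating" column actually uses these labels is an assumption to verify.
STAR_LABELS = {
    "did not like it": 1,
    "it was ok": 2,
    "liked it": 3,
    "really liked it": 4,
    "it was amazing": 5,
}


def rating_to_stars(value):
    """Return 1-5 for a recognised rating, or None when it cannot be interpreted."""
    if value is None:
        return None
    text = value.strip().lower()
    if text in STAR_LABELS:
        return STAR_LABELS[text]
    try:
        stars = int(text)  # handles plain numeric strings such as "4"
    except ValueError:
        return None
    return stars if 1 <= stars <= 5 else None


print(rating_to_stars("it was amazing"))
print(rating_to_stars("not a rating"))
```

Returning None for anything unrecognised keeps unexpected values visible, so they can be counted and inspected instead of being silently coerced.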
