Baselight

Goodreads Book Datasets With User Rating 2M

This dataset is updated every 2 days.

@kaggle.bahramjannesarr_goodreads_book_datasets_10m


About this Dataset

Best quote ever:

Don't ever tell anybody anything. If you do, you start missing everybody.
J.D. Salinger

Story

Everyone knows Goodreads. When book lovers want to buy a book, they first search for its title on the site and read the reviews and ratings available for it.
Do you know a better place to scrape this kind of data? Tell us at ba.jannesar@gmail.com or ghaderi.soroush1995@gmail.com
Goodreads is one of the best places for this job! 💯

These datasets are well suited to two tasks:

1. Creating a book recommendation system based on 10 M books 🥇
2. Using the Description column for NLP 🥈

GitHub repo

Project link on GitHub or here.

Content

Approximately 10,000,000 books are available in the site's archives, and these datasets are collected from them. To make requests to the API, we used the Goodreads Python library.
The datasets will be updated every 2 days.

Acknowledgements

This data was scraped entirely from the Goodreads API.

Inspiration

Do you know what NLP is? Download these datasets, then upvote 💯.

Book Sample

JSON:

 {
    "Id": "5107",
    "Name": "The Catcher in the Rye",
    "RatingDist1": "1:133165",
    "RatingDist2": "2:224884",
    "RatingDist3": "3:553476",
    "RatingDist4": "4:808278",
    "RatingDist5": "5:891037",
    "pagesNumber": 277,
    "RatingDistTotal": "total:2610840",
    "PublishMonth": 30,
    "PublishDay": 1,
    "Publisher": "Back Bay Books",
    "CountsOfReview": 44046,
    "PublishYear": 2001,
    "Language": "eng",
    "Authors": "J.D. Salinger",
    "Rating": 3.8,
    "ISBN": "0316769177",
    "Count of text reviews": 55539,
    "Description": "The hero-narrator of The Catcher in the Rye is an ancient child of sixteen, a native New Yorker named Holden Caulfield. Through circumstances that tend to preclude adult, secondhand description, he leaves his prep school in Pennsylvania and goes underground in New York City for three days. "
 }
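
The RatingDist fields pack a star value and a count into one string, such as "4:808278". Below is a minimal Python sketch of how they might be split apart and checked against RatingDistTotal and Rating. It assumes every such field follows the "label:count" pattern seen in the sample; the helper name parse_rating_dist is hypothetical and not part of the dataset or of any Goodreads library.

```python
import json

# Trimmed copy of the sample record above (only the rating fields are kept).
record = json.loads("""
{
    "Id": "5107",
    "Name": "The Catcher in the Rye",
    "RatingDist1": "1:133165",
    "RatingDist2": "2:224884",
    "RatingDist3": "3:553476",
    "RatingDist4": "4:808278",
    "RatingDist5": "5:891037",
    "RatingDistTotal": "total:2610840",
    "Rating": 3.8
}
""")


def parse_rating_dist(rec):
    """Return ({stars: count}, total) parsed from the 'label:count' strings.

    Hypothetical helper; assumes the 'label:count' layout seen in the sample.
    """
    counts = {}
    for stars in range(1, 6):
        label, count = rec[f"RatingDist{stars}"].split(":")
        counts[int(label)] = int(count)
    total = int(rec["RatingDistTotal"].split(":")[1])
    return counts, total


counts, total = parse_rating_dist(record)

# Weighted mean of the star values, using the per-star counts as weights.
mean_rating = sum(stars * n for stars, n in counts.items()) / total
print(counts[5], total, round(mean_rating, 1))
```

For this sample the five counts add up to the stated total, and the weighted mean rounds to the listed Rating of 3.8, which suggests Rating is derived from the distribution.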

Or CSV:

5107,The Catcher in the Rye,1:133165,277,4:808278,total:2610840,30,1,Back Bay Books,44046,2001,eng,J.D. Salinger,3.8,2:224884,5:891037,0316769177,3:553476,55539,"The hero-narrator of The Catcher in the Rye is an ancient child of sixteen, a native New Yorker named Holden Caulfield. Through circumstances that tend to preclude adult, secondhand description, he leaves his prep school in Pennsylvania and goes underground in New York City for three days. "
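
The CSV row above is shown without a header. Here is a minimal Python sketch for reading such a row with the standard csv module. The column order is an assumption, inferred only by matching the row's values against the JSON sample; if the real files carry a header row, that header should be preferred.

```python
import csv
import io

# Column order inferred by matching the sample row against the JSON record.
# This is an assumption: prefer the file's own header row if it has one.
ASSUMED_COLUMNS = [
    "Id", "Name", "RatingDist1", "pagesNumber", "RatingDist4",
    "RatingDistTotal", "PublishMonth", "PublishDay", "Publisher",
    "CountsOfReview", "PublishYear", "Language", "Authors", "Rating",
    "RatingDist2", "RatingDist5", "ISBN", "RatingDist3",
    "Count of text reviews", "Description",
]

# Sample row from above, with the description shortened for space.
sample_row = (
    '5107,The Catcher in the Rye,1:133165,277,4:808278,total:2610840,30,1,'
    'Back Bay Books,44046,2001,eng,J.D. Salinger,3.8,2:224884,5:891037,'
    '0316769177,3:553476,55539,"The hero-narrator of The Catcher in the Rye '
    'is an ancient child of sixteen, a native New Yorker named Holden Caulfield."\n'
)

# csv handles the quoted description, which itself contains commas.
reader = csv.DictReader(io.StringIO(sample_row), fieldnames=ASSUMED_COLUMNS)
book = next(reader)

# Every value arrives as a string; convert the numeric columns explicitly.
# ISBN stays a string so that its leading zero survives.
book["pagesNumber"] = int(book["pagesNumber"])
book["Rating"] = float(book["Rating"])

print(book["Name"], book["pagesNumber"], book["ISBN"])
```

Keeping ISBN as text matters here: read as a number, "0316769177" would lose its leading zero.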

Tables

User Rating 2000 To 3000

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.user_rating_2000_to_3000
  • 527.96 KB
  • 30633 rows
  • 3 columns

CREATE TABLE user_rating_2000_to_3000 (
  "id" BIGINT,
  "name" VARCHAR,
  "rating" VARCHAR
);
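
Since "rating" is declared as VARCHAR, it probably needs converting before any arithmetic. The Python sketch below uses an in-memory SQLite database as a stand-in for the real table. It rests on three assumptions not confirmed on this page: that "rating" holds Goodreads' text labels (such as "it was amazing"), that "id" identifies a user, and that "name" is a book title. The inserted rows are made up for illustration.

```python
import sqlite3

# Assumption: "rating" holds Goodreads' text labels rather than numbers.
LABEL_TO_STARS = {
    "did not like it": 1,
    "it was ok": 2,
    "liked it": 3,
    "really liked it": 4,
    "it was amazing": 5,
}

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE user_rating_2000_to_3000 '
    '("id" BIGINT, "name" VARCHAR, "rating" VARCHAR)'
)
# Made-up rows for illustration only; they are not taken from the dataset.
conn.executemany(
    "INSERT INTO user_rating_2000_to_3000 VALUES (?, ?, ?)",
    [
        (2001, "The Catcher in the Rye", "it was amazing"),
        (2001, "Hypothetical Book A", "liked it"),
        (2002, "The Catcher in the Rye", "really liked it"),
    ],
)

rows = conn.execute(
    'SELECT "id", "name", "rating" FROM user_rating_2000_to_3000'
).fetchall()

# Convert labels to stars, skipping any label outside the assumed mapping.
numeric = [
    (user_id, title, LABEL_TO_STARS[label])
    for user_id, title, label in rows
    if label in LABEL_TO_STARS
]

# Average star value for one title across the made-up rows.
catcher = [stars for _, title, stars in numeric if title == "The Catcher in the Rye"]
print(sum(catcher) / len(catcher))
```

Skipping unknown labels rather than raising keeps the sketch tolerant of values the mapping does not anticipate; counting the skipped rows would be a sensible next step on real data.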

User Rating 3000 To 4000

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.user_rating_3000_to_4000
  • 762.46 KB
  • 46970 rows
  • 3 columns

CREATE TABLE user_rating_3000_to_4000 (
  "id" BIGINT,
  "name" VARCHAR,
  "rating" VARCHAR
);

User Rating 4000 To 5000

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.user_rating_4000_to_5000
  • 760.64 KB
  • 46903 rows
  • 3 columns

CREATE TABLE user_rating_4000_to_5000 (
  "id" BIGINT,
  "name" VARCHAR,
  "rating" VARCHAR
);

User Rating 5000 To 6000

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.user_rating_5000_to_6000
  • 282.29 KB
  • 15481 rows
  • 3 columns

CREATE TABLE user_rating_5000_to_6000 (
  "id" BIGINT,
  "name" VARCHAR,
  "rating" VARCHAR
);

User Rating 6000 To 11000

@kaggle.bahramjannesarr_goodreads_book_datasets_10m.user_rating_6000_to_11000
  • 1.46 MB
  • 127678 rows
  • 3 columns

CREATE TABLE user_rating_6000_to_11000 (
  "id" BIGINT,
  "name" VARCHAR,
  "rating" VARCHAR
);
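
All five user rating tables share the same three columns, so they can be stacked into one logical table. The Python sketch below shows one way to do that with UNION ALL, again using in-memory SQLite as a stand-in for the hosted tables. The view name user_rating_all, the source_table column, and the inserted rows are hypothetical.

```python
import sqlite3

TABLES = [
    "user_rating_2000_to_3000",
    "user_rating_3000_to_4000",
    "user_rating_4000_to_5000",
    "user_rating_5000_to_6000",
    "user_rating_6000_to_11000",
]

conn = sqlite3.connect(":memory:")
for i, table in enumerate(TABLES):
    conn.execute(
        f'CREATE TABLE {table} ("id" BIGINT, "name" VARCHAR, "rating" VARCHAR)'
    )
    # One made-up row per table, for illustration only.
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?, ?)",
        (i, f"Hypothetical Book {i}", "liked it"),
    )

# Build one SELECT per table and join them with UNION ALL,
# tagging each row with the table it came from.
union_sql = "\nUNION ALL\n".join(
    f"SELECT '{table}' AS source_table, id, name, rating FROM {table}"
    for table in TABLES
)
conn.execute(f"CREATE VIEW user_rating_all AS {union_sql}")

total_rows = conn.execute("SELECT COUNT(*) FROM user_rating_all").fetchone()[0]
print(total_rows)
```

UNION ALL is used rather than UNION so that identical rows are not silently merged. Against the real tables, the row counts listed above add up to 267,665.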
