Baselight

Ebay Reviews

Dataset for sentiment analysis

@kaggle.wojtekbonicki_ebay_reviews

About this Dataset

Ebay Reviews

Context

The dataset was created for a data science bootcamp final project. The goal of the project was to build a model for sentiment analysis. The author collected the data using his own Python web scraping scripts.

Content

The data was downloaded from the eBay website.
Two files were uploaded:

  1. ebay_reviews.csv - the dataset consists of 4 columns: product category (e.g. headsets, cell phones, etc.), review title, review content, and rating. The rating is numerical and takes one of the following values: 1, 2, 3, 4, 5, where 1 is the worst score and 5 is the best. The data is not cleaned; it needs to be preprocessed before building models.

  2. ebay_reviews_cleaned.csv - the dataset preprocessed for machine learning algorithms.
    It consists of two columns. The first is the rating column, which takes one of three values:
    -1 - reviews with a rating of 1 or 2
    0 - reviews with a rating of 3
    1 - reviews with a rating of 4 or 5
    The second column is the concatenation of the cleaned review title and review content. For more details, see the "text data cleaning using user-defined transformers" code, which I wrote for this dataset.
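
The rating mapping and the title/content concatenation described above can be sketched roughly as follows. This is only an illustration, not the author's code: the function names `rating_to_label` and `combine_text` are hypothetical, joining the two fields with a single space is an assumption, and the text cleaning done by the author's user-defined transformers is not reproduced here.

```python
from typing import Optional


def rating_to_label(rating: int) -> int:
    """Map a 1-5 rating to -1 (ratings 1, 2), 0 (rating 3), or 1 (ratings 4, 5)."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if rating <= 2:
        return -1
    if rating == 3:
        return 0
    return 1


def combine_text(title: Optional[str], content: Optional[str]) -> str:
    """Join review title and content into one string.

    Assumption: the fields are joined with a single space and empty or
    missing fields are skipped. The real cleaning steps are not shown.
    """
    parts = [p.strip() for p in (title, content) if p and p.strip()]
    return " ".join(parts)


labels = [rating_to_label(r) for r in (1, 2, 3, 4, 5)]
review = combine_text("Great headset", " Clear sound and easy setup ")
```

Collapsing five ratings into three classes trades granularity for a simpler target, which is a common choice for sentiment models; whether it suits a given task depends on how the neutral class is meant to be used.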

Let me know if you need the scripts for downloading eBay reviews. I will share them.

Tables

Ebay Reviews

@kaggle.wojtekbonicki_ebay_reviews.ebay_reviews
  • 5.55 MB
  • 44756 rows
  • 4 columns

CREATE TABLE ebay_reviews (
  "category" VARCHAR,
  "review_title" VARCHAR,
  "review_content" VARCHAR,
  "rating" BIGINT
);

Ebay Reviews Cleaned

@kaggle.wojtekbonicki_ebay_reviews.ebay_reviews_cleaned
  • 3.92 MB
  • 45752 rows
  • 3 columns

CREATE TABLE ebay_reviews_cleaned (
  "unnamed_0" BIGINT,
  "rating" BIGINT,
  "review" VARCHAR
);
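
As a rough sketch of how a table with this schema might be inspected before modelling, the snippet below counts how many reviews fall into each sentiment class using only the Python standard library. The sample rows are made up for illustration and are not taken from the dataset; the helper name `label_counts` is hypothetical, and the header names are assumed to match the column names in the schema above (an exported CSV may name the first column differently).

```python
import csv
import io
from collections import Counter

# Made-up rows in the shape of ebay_reviews_cleaned; NOT real data.
SAMPLE_CSV = """unnamed_0,rating,review
0,1,great headset clear sound
1,-1,stopped working after week
2,0,okay for price
3,1,fast shipping works well
"""


def label_counts(csv_file) -> Counter:
    """Count rows per sentiment label (-1, 0, 1) in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["rating"]) for row in reader)


# For a real file, pass open("ebay_reviews_cleaned.csv", newline="", encoding="utf-8").
counts = label_counts(io.StringIO(SAMPLE_CSV))
```

Checking the class balance first is useful because review data is often skewed toward one class, which affects how a sentiment model should be trained and evaluated.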
