Baselight

Dirty E-Commerce Data [80,000+ Products]

Practice Cleaning Dirty Data - 20+ category files and 80,000+ products

@kaggle.oleksiimartusiuk_e_commerce_data_shein

About this Dataset

E-commerce Product Dataset - Clean and Enhance Your Data Analysis Skills or Check Out The Cleaned File Below!

This dataset collects product information from an e-commerce store, spread across 20+ CSV files and covering more than 80,000 products. It is a good opportunity to test and refine your data cleaning and wrangling skills.

What's Included:

A variety of product categories, including:

  • Apparel & Accessories
  • Electronics
  • Home & Kitchen
  • Beauty & Health
  • Toys & Games
  • Men's Clothes
  • Women's Clothes
  • Pet Supplies
  • Sports & Outdoor
  • (and more!)

Each product record contains details such as:

  • Product Title
  • Category
  • Price
  • Discount information
  • (and other attributes)

Challenges and Opportunities:

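Data Cleaning: The dataset is "dirty": it contains missing values, inconsistent formatting, and potential errors. For example, the price and discount columns are stored as VARCHAR in every table listed on this page, and the set of columns varies from table to table. This gives you a chance to practice data-cleaning techniques such as: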

  • Identifying and handling missing values
  • Standardizing data formats
  • Correcting inconsistencies
  • Dealing with duplicate entries
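
As a starting point, the cleaning steps above could be sketched in plain Python. This is a minimal sketch, not the dataset author's method. The column names (goods_title_link, price, discount) come from the table schemas on this page, but the value formats (prices like "$1,299.50", discounts like "-20%") are assumptions, and the helper names parse_price, parse_discount, and clean_rows are hypothetical. Check a sample of the real values before relying on it.

```python
import re
from typing import Dict, List, Optional


def parse_price(raw: Optional[str]) -> Optional[float]:
    """Return the first number found in a price string, or None if there is none.

    ASSUMPTION: prices look like "$12.99" or "$1,299.50"; the real format may differ.
    """
    if raw is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", raw.replace(",", ""))
    return float(match.group()) if match else None


def parse_discount(raw: Optional[str]) -> Optional[float]:
    """Return a discount as a positive percentage, or None if it cannot be read.

    ASSUMPTION: discounts look like "-20%"; the real format may differ.
    """
    if raw is None:
        return None
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", raw)
    return float(match.group(1)) if match else None


def clean_rows(rows: List[Dict[str, Optional[str]]]) -> List[dict]:
    """Standardize formats, drop rows missing a title or price, and de-duplicate."""
    seen = set()
    cleaned = []
    for row in rows:
        title = (row.get("goods_title_link") or "").strip()
        price = parse_price(row.get("price"))
        if not title or price is None:
            continue  # missing key field: dropped here, but imputing is also valid
        key = (title.lower(), price)  # duplicates = same title (case-insensitive) and price
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {
                "title": title,
                "price": price,
                "discount_pct": parse_discount(row.get("discount")),
            }
        )
    return cleaned
```

Dropping rows that lack a title or price is only one option; you might prefer to keep them and flag or impute the missing fields instead.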

Feature Engineering: After cleaning, you can explore opportunities to create new features from the existing data, such as:

  • Extracting keywords from product titles and descriptions
  • Deriving price categories
  • Calculating average discounts
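
The three feature ideas above might look like this in plain Python. Treat it as an illustrative sketch under stated assumptions: it expects prices and discounts that have already been converted to numbers, the stopword list and the 10/50 price thresholds are arbitrary placeholders, and the function names are hypothetical rather than part of the dataset.

```python
import re
from statistics import mean
from typing import Iterable, List, Optional

# ASSUMPTION: a tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "with"}


def title_keywords(title: str) -> List[str]:
    """Lowercase the title, split it into alphanumeric tokens, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return [w for w in words if w not in STOPWORDS and len(w) > 1]


def price_category(price: float, low: float = 10.0, high: float = 50.0) -> str:
    """Bucket a numeric price; the thresholds are arbitrary and should be tuned."""
    if price < low:
        return "budget"
    if price < high:
        return "mid"
    return "premium"


def average_discount(discounts: Iterable[Optional[float]]) -> Optional[float]:
    """Mean of the known discount percentages, ignoring missing values."""
    known = [d for d in discounts if d is not None]
    return mean(known) if known else None
```

Ignoring missing discounts when averaging is a design choice: treating them as 0% instead would lower the average, so pick whichever matches what a missing value means in the data.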

Who can benefit from this dataset?

  • Data analysts and scientists looking to practice data cleaning and wrangling skills on a real-world e-commerce dataset
  • Machine learning enthusiasts interested in building models for product recommendation, price prediction, or other e-commerce tasks
  • Anyone interested in exploring and understanding the structure and organization of product data in an e-commerce setting

By contributing to this dataset and sharing your cleaning and feature engineering approaches, you can help create a valuable resource for the Kaggle community!

Tables

Us Shein Appliances 3987

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_appliances_3987
  • 318.9 KB
  • 3986 rows
  • 8 columns

CREATE TABLE us_shein_appliances_3987 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Automotive 4110

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_automotive_4110
  • 241.32 KB
  • 4109 rows
  • 6 columns

CREATE TABLE us_shein_automotive_4110 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Baby And Maternity 4433

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_baby_and_maternity_4433
  • 220.88 KB
  • 4432 rows
  • 7 columns

CREATE TABLE us_shein_baby_and_maternity_4433 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR
);

Us Shein Bags And Luggage 4299

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_bags_and_luggage_4299
  • 390.73 KB
  • 4298 rows
  • 9 columns

CREATE TABLE us_shein_bags_and_luggage_4299 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Beauty And Health 4267

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_beauty_and_health_4267
  • 388.02 KB
  • 4266 rows
  • 9 columns

CREATE TABLE us_shein_beauty_and_health_4267 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "color_count" DOUBLE,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Curve 2849

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_curve_2849
  • 112.54 KB
  • 2848 rows
  • 9 columns

CREATE TABLE us_shein_curve_2849 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Electronics 4395

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_electronics_4395
  • 328.31 KB
  • 4394 rows
  • 9 columns

CREATE TABLE us_shein_electronics_4395 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR
);

Us Shein Home And Kitchen 3719

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_home_and_kitchen_3719
  • 312.01 KB
  • 3719 rows
  • 6 columns

CREATE TABLE us_shein_home_and_kitchen_3719 (
  "goods_title_link" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR
);

Us Shein Home Textile 3883

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_home_textile_3883
  • 257.33 KB
  • 3882 rows
  • 9 columns

CREATE TABLE us_shein_home_textile_3883 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "color_count" DOUBLE,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Jewelry And Accessories 3548

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_jewelry_and_accessories_3548
  • 219.42 KB
  • 3547 rows
  • 8 columns

CREATE TABLE us_shein_jewelry_and_accessories_3548 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Kids 4314

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_kids_4314
  • 203.53 KB
  • 4313 rows
  • 9 columns

CREATE TABLE us_shein_kids_4314 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Mens Clothes 1891

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_mens_clothes_1891
  • 65.31 KB
  • 1890 rows
  • 7 columns

CREATE TABLE us_shein_mens_clothes_1891 (
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR,
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR
);

Us Shein Office And School Supplies 4233

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_office_and_school_supplies_4233
  • 310.92 KB
  • 4232 rows
  • 8 columns

CREATE TABLE us_shein_office_and_school_supplies_4233 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Pet Supplies 4083

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_pet_supplies_4083
  • 276.5 KB
  • 4082 rows
  • 11 columns

CREATE TABLE us_shein_pet_supplies_4083 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "blackfridaybelts_bg_src" VARCHAR,
  "blackfridaybelts_content" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "discount" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR
);

Us Shein Shoes 4381

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_shoes_4381
  • 246.73 KB
  • 4380 rows
  • 9 columns

CREATE TABLE us_shein_shoes_4381 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Sports And Outdoors 3853

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_sports_and_outdoors_3853
  • 258.81 KB
  • 3852 rows
  • 11 columns

CREATE TABLE us_shein_sports_and_outdoors_3853 (
  "color_count" DOUBLE,
  "blackfridaybelts_bg_src" VARCHAR,
  "blackfridaybelts_content" VARCHAR,
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Swimwear 3761

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_swimwear_3761
  • 125.56 KB
  • 3760 rows
  • 8 columns

CREATE TABLE us_shein_swimwear_3761 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "price" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Tools And Home Improvement 3903

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_tools_and_home_improvement_3903
  • 334.28 KB
  • 3902 rows
  • 8 columns

CREATE TABLE us_shein_tools_and_home_improvement_3903 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Toys And Games 3577

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_toys_and_games_3577
  • 302.19 KB
  • 3576 rows
  • 6 columns

CREATE TABLE us_shein_toys_and_games_3577 (
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Underwear And Sleepwear 4019

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_underwear_and_sleepwear_4019
  • 157.98 KB
  • 4018 rows
  • 9 columns

CREATE TABLE us_shein_underwear_and_sleepwear_4019 (
  "goods_title_link_jump" VARCHAR,
  "goods_title_link_jump_href" VARCHAR,
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR
);

Us Shein Womens Clothing 4620

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_womens_clothing_4620
  • 186.53 KB
  • 4619 rows
  • 8 columns

CREATE TABLE us_shein_womens_clothing_4620 (
  "product_locatelabels_img_src" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR
);
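
The tables above do not share one schema: some include color_count or the blackfridaybelts_* columns, and a few lack goods_title_link_jump. If you want to analyze all categories together, one possible first step is to align the columns. The sketch below assumes each table has already been loaded as a list of dictionaries keyed by column name. The harmonize function and the source_table field are hypothetical names introduced for illustration.

```python
from typing import Dict, List


def harmonize(tables: Dict[str, List[dict]]) -> List[dict]:
    """Combine tables with different columns into rows that share one column set.

    Columns missing from a table are filled with None, and each row records
    which table it came from in a "source_table" field (a name assumed here).
    """
    # Collect the union of column names, preserving first-seen order.
    columns: List[str] = []
    for rows in tables.values():
        for row in rows:
            for col in row:
                if col not in columns:
                    columns.append(col)

    combined = []
    for name, rows in tables.items():
        for row in rows:
            record = {col: row.get(col) for col in columns}
            record["source_table"] = name
            combined.append(record)
    return combined
```

Keeping the source table name on each row preserves the product category, which otherwise exists only in the table name.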
