Baselight

Dirty E-Commerce Data [80,000+ Products]

Practice Cleaning Dirty Data - 20+ category files and 80,000+ products

@kaggle.oleksiimartusiuk_e_commerce_data_shein


About this Dataset

E-commerce Product Dataset - Clean and Enhance Your Data Analysis Skills or Check Out The Cleaned File Below!

This dataset offers a comprehensive collection of product information from an e-commerce store, spread across 20+ CSV files and covering more than 80,000 products. It is a good opportunity to test and refine your data cleaning and wrangling skills.

What's Included:

A variety of product categories, including:

  • Apparel & Accessories
  • Electronics
  • Home & Kitchen
  • Beauty & Health
  • Toys & Games
  • Men's Clothes
  • Women's Clothes
  • Pet Supplies
  • Sports & Outdoor
  • (and more!)

Each product record contains details such as:

  • Product Title
  • Category
  • Price
  • Discount information
  • (and other attributes)

Challenges and Opportunities:

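Data Cleaning: The dataset is "dirty": it contains missing values, inconsistent formatting, and potential errors. For example, the price and discount columns are stored as text (VARCHAR), and the set and order of columns differ from one category file to the next. This gives you a chance to practice data-cleaning techniques such as: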

  • Identifying and handling missing values
  • Standardizing data formats
  • Correcting inconsistencies
  • Dealing with duplicate entries
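
The cleaning steps above can be sketched in plain Python. This is a minimal illustration, not the dataset author's method: the column names (goods_title_link, price, discount) come from the table schemas on this page, but the value formats it handles (prices like "$1,299.50", discounts like "-20%") are assumptions, and the helper names parse_price, parse_discount, and clean_rows are hypothetical.

```python
import re


def parse_price(raw):
    """Return the price as a float, or None if it is missing or unparseable.

    Assumes prices look like "$12.99" or "$1,299.50" (not confirmed by the source).
    """
    if raw is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", raw.replace(",", ""))
    return float(match.group()) if match else None


def parse_discount(raw):
    """Return the discount as a positive fraction (0.2 for "-20%"), or None.

    Assumes discounts are percentage strings (not confirmed by the source).
    """
    if raw is None:
        return None
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", raw)
    return float(match.group(1)) / 100 if match else None


def clean_rows(rows):
    """Normalize titles, parse numeric fields, and drop duplicates.

    Rows without a title are skipped. Two rows count as duplicates when they
    share the same case-insensitive title and the same parsed price.
    """
    seen = set()
    cleaned = []
    for row in rows:
        # Collapse repeated whitespace and trim the title.
        title = " ".join((row.get("goods_title_link") or "").split())
        if not title:
            continue
        price = parse_price(row.get("price"))
        key = (title.lower(), price)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {
                "title": title,
                "price": price,
                "discount": parse_discount(row.get("discount")),
            }
        )
    return cleaned
```

Missing prices and discounts are kept as None rather than filled in, so a later step can decide whether to drop or impute them.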

Feature Engineering: After cleaning, you can explore opportunities to create new features from the existing data, such as:

  • Extracting keywords from product titles and descriptions
  • Deriving price categories
  • Calculating average discounts
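
A rough sketch of these three features, using only the Python standard library. It assumes records that have already been cleaned into dictionaries with a numeric price and a fractional discount; the stopword list, the price thresholds (10 and 50), and the function names are illustrative choices, not part of the dataset.

```python
import re
from collections import Counter
from statistics import mean

# Illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "with"}


def extract_keywords(title, top_n=3):
    """Return up to top_n of the most frequent non-stopword words in a title."""
    words = re.findall(r"[a-z]+", title.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def price_category(price, low=10.0, high=50.0):
    """Bucket a price into a coarse category; thresholds are arbitrary examples."""
    if price is None:
        return "unknown"
    if price < low:
        return "budget"
    if price < high:
        return "mid"
    return "premium"


def average_discount(products):
    """Mean of the known discounts, ignoring products with no discount value."""
    values = [p["discount"] for p in products if p.get("discount") is not None]
    return mean(values) if values else None
```

Products with a missing discount are left out of the average rather than counted as zero; whether that is the right choice depends on what a missing value means in each file.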

Who can benefit from this dataset?

  • Data analysts and scientists looking to practice data cleaning and wrangling skills on a real-world e-commerce dataset
  • Machine learning enthusiasts interested in building models for product recommendation, price prediction, or other e-commerce tasks
  • Anyone interested in exploring and understanding the structure and organization of product data in an e-commerce setting

By contributing to this dataset and sharing your cleaning and feature engineering approaches, you can help create a valuable resource for the Kaggle community!

Tables

Us Shein Appliances 3987

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_appliances_3987
  • 326.56 kB
  • 3,986 rows
  • 8 columns
CREATE TABLE us_shein_appliances_3987 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Automotive 4110

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_automotive_4110
  • 247.11 kB
  • 4,109 rows
  • 6 columns
CREATE TABLE us_shein_automotive_4110 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Baby And Maternity 4433

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_baby_and_maternity_4433
  • 226.18 kB
  • 4,432 rows
  • 7 columns
CREATE TABLE us_shein_baby_and_maternity_4433 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR
);

Us Shein Bags And Luggage 4299

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_bags_and_luggage_4299
  • 400.11 kB
  • 4,298 rows
  • 9 columns
CREATE TABLE us_shein_bags_and_luggage_4299 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Beauty And Health 4267

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_beauty_and_health_4267
  • 397.33 kB
  • 4,266 rows
  • 9 columns
CREATE TABLE us_shein_beauty_and_health_4267 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "color_count" DOUBLE,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Curve 2849

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_curve_2849
  • 115.24 kB
  • 2,848 rows
  • 9 columns
CREATE TABLE us_shein_curve_2849 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Electronics 4395

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_electronics_4395
  • 336.19 kB
  • 4,394 rows
  • 9 columns
CREATE TABLE us_shein_electronics_4395 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR
);

Us Shein Home And Kitchen 3719

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_home_and_kitchen_3719
  • 319.5 kB
  • 3,719 rows
  • 6 columns
CREATE TABLE us_shein_home_and_kitchen_3719 (
  "goods_title_link" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR
);

Us Shein Home Textile 3883

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_home_textile_3883
  • 263.5 kB
  • 3,882 rows
  • 9 columns
CREATE TABLE us_shein_home_textile_3883 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "color_count" DOUBLE,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Jewelry And Accessories 3548

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_jewelry_and_accessories_3548
  • 224.69 kB
  • 3,547 rows
  • 8 columns
CREATE TABLE us_shein_jewelry_and_accessories_3548 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Kids 4314

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_kids_4314
  • 208.41 kB
  • 4,313 rows
  • 9 columns
CREATE TABLE us_shein_kids_4314 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Mens Clothes 1891

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_mens_clothes_1891
  • 66.87 kB
  • 1,890 rows
  • 7 columns
CREATE TABLE us_shein_mens_clothes_1891 (
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR,
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR
);

Us Shein Office And School Supplies 4233

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_office_and_school_supplies_4233
  • 318.38 kB
  • 4,232 rows
  • 8 columns
CREATE TABLE us_shein_office_and_school_supplies_4233 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Pet Supplies 4083

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_pet_supplies_4083
  • 283.14 kB
  • 4,082 rows
  • 11 columns
CREATE TABLE us_shein_pet_supplies_4083 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "blackfridaybelts_bg_src" VARCHAR,
  "blackfridaybelts_content" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "discount" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR
);

Us Shein Shoes 4381

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_shoes_4381
  • 252.65 kB
  • 4,380 rows
  • 9 columns
CREATE TABLE us_shein_shoes_4381 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Sports And Outdoors 3853

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_sports_and_outdoors_3853
  • 265.02 kB
  • 3,852 rows
  • 11 columns
CREATE TABLE us_shein_sports_and_outdoors_3853 (
  "color_count" DOUBLE,
  "blackfridaybelts_bg_src" VARCHAR,
  "blackfridaybelts_content" VARCHAR,
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Swimwear 3761

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_swimwear_3761
  • 128.58 kB
  • 3,760 rows
  • 8 columns
CREATE TABLE us_shein_swimwear_3761 (
  "color_count" DOUBLE,
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "price" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Tools And Home Improvement 3903

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_tools_and_home_improvement_3903
  • 342.31 kB
  • 3,902 rows
  • 8 columns
CREATE TABLE us_shein_tools_and_home_improvement_3903 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Toys And Games 3577

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_toys_and_games_3577
  • 309.45 kB
  • 3,576 rows
  • 6 columns
CREATE TABLE us_shein_toys_and_games_3577 (
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "selling_proposition" VARCHAR,
  "discount" VARCHAR,
  "goods_title_link" VARCHAR
);

Us Shein Underwear And Sleepwear 4019

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_underwear_and_sleepwear_4019
  • 161.78 kB
  • 4,018 rows
  • 9 columns
CREATE TABLE us_shein_underwear_and_sleepwear_4019 (
  "goods_title_link_jump" VARCHAR,  -- Goods-title-link--jump
  "goods_title_link_jump_href" VARCHAR,  -- Goods-title-link--jump Href
  "selling_proposition" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR
);

Us Shein Womens Clothing 4620

@kaggle.oleksiimartusiuk_e_commerce_data_shein.us_shein_womens_clothing_4620
  • 191.01 kB
  • 4,619 rows
  • 8 columns
CREATE TABLE us_shein_womens_clothing_4620 (
  "product_locatelabels_img_src" VARCHAR,
  "color_count" DOUBLE,
  "goods_title_link" VARCHAR,
  "rank_title" VARCHAR,
  "rank_sub" VARCHAR,
  "price" VARCHAR,
  "discount" VARCHAR,
  "selling_proposition" VARCHAR
);
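
The tables above do not share one column set: color_count, rank_title, and the blackfridaybelts_* columns, for instance, appear in only some of them. One possible way to combine them is to build the union of all column names and fill the gaps with None. The sketch below assumes each table has already been loaded as a list of dictionaries (for example with csv.DictReader); the function name unify_rows and the added category field are hypothetical.

```python
def unify_rows(tables):
    """Merge rows from several category tables into one shared schema.

    `tables` maps a category name to a list of row dictionaries. Returns the
    unified column list and the merged rows; columns a table lacks are None.
    """
    # Collect the union of column names, keeping first-seen order.
    columns = []
    for rows in tables.values():
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)

    unified = []
    for category, rows in tables.items():
        for row in rows:
            record = {name: row.get(name) for name in columns}
            record["category"] = category  # remember which file the row came from
            unified.append(record)
    return columns + ["category"], unified
```

Keeping the source category on every row preserves information that otherwise lives only in the table name.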
