FIFA 21 Messy, Raw Dataset For Cleaning/ Exploring

FIFA 21 complete players' messy and raw dataset

@kaggle.yagunnersya_fifa_21_messy_raw_dataset_for_cleaning_exploring

About this Dataset

Context

Kaggle is notorious for providing pure, clean datasets ready for analysis and model building.

So here I present to you a very messy and raw dataset of EA Sports' latest installment in their hit FIFA series, FIFA 21, which I scraped from sofifa.com.

Content

One of the challenges of web scraping is unclean data, and that's natural, really. Different front-end developers write their HTML in their own way, and that makes the incoming data unpredictable.

You'll definitely learn a lot about data cleaning with this dataset.

Acknowledgements

A huge round of applause for sofifa.com for providing this amazing data!

Inspiration

  1. Convert the height and weight columns to numeric form.
  2. Remove the unnecessary newline characters from all columns that have them.
  3. Based on the 'Joined' column, check which players have been playing at a club for more than 10 years!
  4. 'Value', 'Wage' and 'Release Clause' are string columns. Convert them to numbers. For example, "M" in the 'Value' column means million, so multiply those row values by 1,000,000, and so on.
  5. Some columns have 'star' characters. Strip the stars from those columns and make the columns numerical.
  6. Which players are highly valuable but still underpaid (on low wages)? (hint: scatter plot between wage and value)
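
The string-to-number tasks in the list above (items 1, 2, 4 and 5) can be sketched with a few small helpers. This is only a minimal sketch using the Python standard library, not the dataset author's method. The raw cell formats are assumptions, since this page does not show sample values: heights like `170cm` or `5'7"`, weights like `72kg` or `159lbs`, money like `€560K` or `€103.5M`, and ratings like `4 ★`. Every function name is hypothetical, and the "K" = thousand multiplier is assumed by analogy with the "M" = million rule given in item 4.

```python
import re

# "M" = million comes from the task list; "K" = thousand is an assumption.
MULTIPLIERS = {"K": 1_000, "M": 1_000_000}


def strip_newlines(text):
    """Remove newline characters and any surrounding whitespace."""
    return text.replace("\n", "").strip()


def parse_money(text):
    """Convert an assumed format such as '€103.5M' or '€560K' to a float."""
    cleaned = strip_newlines(text).lstrip("€")
    suffix = cleaned[-1:].upper()
    if suffix in MULTIPLIERS:
        return float(cleaned[:-1]) * MULTIPLIERS[suffix]
    return float(cleaned)


def parse_height_cm(text):
    """Convert '170cm' or feet/inches such as 5'7" to centimetres."""
    cleaned = strip_newlines(text)
    metric = re.fullmatch(r"(\d+)\s*cm", cleaned)
    if metric:
        return float(metric.group(1))
    imperial = re.fullmatch(r"(\d+)'\s*(\d+)\"?", cleaned)
    if imperial:
        total_inches = int(imperial.group(1)) * 12 + int(imperial.group(2))
        return round(total_inches * 2.54, 1)
    raise ValueError(f"unrecognised height: {text!r}")


def parse_weight_kg(text):
    """Convert '72kg' or '159lbs' to kilograms."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(kg|lbs)", strip_newlines(text))
    if not match:
        raise ValueError(f"unrecognised weight: {text!r}")
    value = float(match.group(1))
    if match.group(2) == "kg":
        return value
    return round(value * 0.45359237, 1)  # pounds to kilograms


def parse_stars(text):
    """Drop the star character from a rating such as '4 ★'."""
    return int(text.replace("★", "").strip())
```

Each helper raises `ValueError` on input it does not recognise, which makes unexpected formats show up early instead of silently turning into wrong numbers. In the tables listed further down, the VARCHAR columns `w_f`, `sm` and `ir` look like plausible candidates for `parse_stars`, but that is a guess based on their types.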

Ask more questions yourself!
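
As a starting point for questions like items 3 and 6 in the list, here is a rough sketch in plain Python. It assumes the rows have already been cleaned into dictionaries with a `joined` date and numeric `value` and `wage` fields. The field names, the function names and the value-per-wage ratio are illustrative choices, not something the dataset prescribes. The ratio is just one possible stand-in for reading the suggested scatter plot.

```python
from datetime import date


def served_more_than(joined, as_of, years=10):
    """True if `joined` falls strictly before the date `years` before `as_of`."""
    cutoff = (as_of.year - years, as_of.month, as_of.day)
    return (joined.year, joined.month, joined.day) < cutoff


def long_serving(players, as_of, years=10):
    """Names of players who have been at their club for more than `years` years."""
    return [p["name"] for p in players
            if served_more_than(p["joined"], as_of, years)]


def rank_by_value_per_wage(players):
    """Sort players by value divided by wage, highest ratio first.

    Players with a wage of zero are skipped to avoid dividing by zero.
    """
    ratios = [(p["name"], p["value"] / p["wage"])
              for p in players if p["wage"] > 0]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)
```

Comparing (year, month, day) tuples avoids having to build a date that may not exist, such as 29 February in a non-leap year. A player with a high value-to-wage ratio is only a candidate for "underpaid"; a scatter plot of wage against value, as the hint suggests, gives a fuller picture.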

Hope it helps! :) If you like this dataset, please show your support by upvoting this dataset! Thanks! :)

Tables

Fifa21 Raw Data

@kaggle.yagunnersya_fifa_21_messy_raw_dataset_for_cleaning_exploring.fifa21_raw_data
  • 2.52 MB
  • 18979 rows
  • 77 columns

CREATE TABLE fifa21_raw_data (
  "photourl" VARCHAR,
  "longname" VARCHAR,
  "playerurl" VARCHAR,
  "nationality" VARCHAR,
  "positions" VARCHAR,
  "name" VARCHAR,
  "age" BIGINT,
  "n__ova" BIGINT,
  "pot" BIGINT,
  "team_contract" VARCHAR,
  "id" BIGINT,
  "height" VARCHAR,
  "weight" VARCHAR,
  "foot" VARCHAR,
  "bov" BIGINT,
  "bp" VARCHAR,
  "growth" BIGINT,
  "joined" TIMESTAMP,
  "loan_date_end" TIMESTAMP,
  "value" VARCHAR,
  "wage" VARCHAR,
  "release_clause" VARCHAR,
  "attacking" BIGINT,
  "crossing" BIGINT,
  "finishing" BIGINT,
  "heading_accuracy" BIGINT,
  "short_passing" BIGINT,
  "volleys" BIGINT,
  "skill" BIGINT,
  "dribbling" BIGINT,
  "curve" BIGINT,
  "fk_accuracy" BIGINT,
  "long_passing" BIGINT,
  "ball_control" BIGINT,
  "movement" BIGINT,
  "acceleration" BIGINT,
  "sprint_speed" BIGINT,
  "agility" BIGINT,
  "reactions" BIGINT,
  "balance" BIGINT,
  "power" BIGINT,
  "shot_power" BIGINT,
  "jumping" BIGINT,
  "stamina" BIGINT,
  "strength" BIGINT,
  "long_shots" BIGINT,
  "mentality" BIGINT,
  "aggression" BIGINT,
  "interceptions" BIGINT,
  "positioning" BIGINT,
  "vision" BIGINT,
  "penalties" BIGINT,
  "composure" BIGINT,
  "defending" BIGINT,
  "marking" BIGINT,
  "standing_tackle" BIGINT,
  "sliding_tackle" BIGINT,
  "goalkeeping" BIGINT,
  "gk_diving" BIGINT,
  "gk_handling" BIGINT,
  "gk_kicking" BIGINT,
  "gk_positioning" BIGINT,
  "gk_reflexes" BIGINT,
  "total_stats" BIGINT,
  "base_stats" BIGINT,
  "w_f" VARCHAR,
  "sm" VARCHAR,
  "a_w" VARCHAR,
  "d_w" VARCHAR,
  "ir" VARCHAR,
  "pac" BIGINT,
  "sho" BIGINT,
  "pas" BIGINT,
  "dri" BIGINT,
  "def" BIGINT,
  "phy" BIGINT,
  "hits" VARCHAR
);

Fifa21 Raw Data V2

@kaggle.yagunnersya_fifa_21_messy_raw_dataset_for_cleaning_exploring.fifa21_raw_data_v2
  • 2.44 MB
  • 18979 rows
  • 77 columns

CREATE TABLE fifa21_raw_data_v2 (
  "id" BIGINT,
  "name" VARCHAR,
  "longname" VARCHAR,
  "photourl" VARCHAR,
  "playerurl" VARCHAR,
  "nationality" VARCHAR,
  "age" BIGINT,
  "n__ova" BIGINT,
  "pot" BIGINT,
  "club" VARCHAR,
  "contract" VARCHAR,
  "positions" VARCHAR,
  "height" VARCHAR,
  "weight" VARCHAR,
  "preferred_foot" VARCHAR,
  "bov" BIGINT,
  "best_position" VARCHAR,
  "joined" TIMESTAMP,
  "loan_date_end" TIMESTAMP,
  "value" VARCHAR,
  "wage" VARCHAR,
  "release_clause" VARCHAR,
  "attacking" BIGINT,
  "crossing" BIGINT,
  "finishing" BIGINT,
  "heading_accuracy" BIGINT,
  "short_passing" BIGINT,
  "volleys" BIGINT,
  "skill" BIGINT,
  "dribbling" BIGINT,
  "curve" BIGINT,
  "fk_accuracy" BIGINT,
  "long_passing" BIGINT,
  "ball_control" BIGINT,
  "movement" BIGINT,
  "acceleration" BIGINT,
  "sprint_speed" BIGINT,
  "agility" BIGINT,
  "reactions" BIGINT,
  "balance" BIGINT,
  "power" BIGINT,
  "shot_power" BIGINT,
  "jumping" BIGINT,
  "stamina" BIGINT,
  "strength" BIGINT,
  "long_shots" BIGINT,
  "mentality" BIGINT,
  "aggression" BIGINT,
  "interceptions" BIGINT,
  "positioning" BIGINT,
  "vision" BIGINT,
  "penalties" BIGINT,
  "composure" BIGINT,
  "defending" BIGINT,
  "marking" BIGINT,
  "standing_tackle" BIGINT,
  "sliding_tackle" BIGINT,
  "goalkeeping" BIGINT,
  "gk_diving" BIGINT,
  "gk_handling" BIGINT,
  "gk_kicking" BIGINT,
  "gk_positioning" BIGINT,
  "gk_reflexes" BIGINT,
  "total_stats" BIGINT,
  "base_stats" BIGINT,
  "w_f" VARCHAR,
  "sm" VARCHAR,
  "a_w" VARCHAR,
  "d_w" VARCHAR,
  "ir" VARCHAR,
  "pac" BIGINT,
  "sho" BIGINT,
  "pas" BIGINT,
  "dri" BIGINT,
  "def" BIGINT,
  "phy" BIGINT,
  "hits" VARCHAR
);
