This dataset provides the 2024 Summer Olympics medal count for top-performing countries, along with their corresponding Gross Domestic Product (GDP) values for the year 2023. The dataset includes the number of gold, silver, and bronze medals won, the total number of medals, and the GDP per country.
How the data was collected and processed
- After obtaining the medal data from Kaggle, the GDP data from the World Bank, and the country codes from IBAN, these datasets were merged using the country names as the main key.
- To ensure consistency, minor discrepancies in country codes between the datasets were manually corrected based on the Alpha-3 format (e.g., "US" vs. "USA").
- After merging, the combined dataset was checked for missing or erroneous values so that it is clean and ready for analysis. The one exception is the Refugee Olympic Team: I left its GDP and GDP year columns as null, because the team won a bronze medal and I did not want to drop it.
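The merge described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the exact procedure used to build the dataset. The function name, the dictionary layout, and the column names (`country`, `country_code`, `gdp`, `gdp_year`) are assumptions, and the input rows are made-up toy values.

```python
def merge_medals_with_gdp(medals, code_by_country, gdp_by_code):
    """Left-join medal rows to GDP via an Alpha-3 code looked up by country name.

    Rows without a GDP match are kept, with gdp and gdp_year set to None.
    """
    merged = []
    for row in medals:
        code = code_by_country.get(row["country"])
        gdp, year = gdp_by_code.get(code, (None, None))
        merged.append({
            **row,
            "country_code": code,
            "total": row["gold"] + row["silver"] + row["bronze"],
            "gdp": gdp,
            "gdp_year": year,
        })
    return merged


# Toy inputs (hypothetical values, not taken from the real dataset).
medals = [
    {"country": "Country A", "gold": 3, "silver": 2, "bronze": 1},
    {"country": "Refugee Olympic Team", "gold": 0, "silver": 0, "bronze": 1},
]
code_by_country = {"Country A": "AAA", "Refugee Olympic Team": "EOR"}
gdp_by_code = {"AAA": (1500.0, 2023)}  # code -> (GDP value, GDP year)

merged = merge_medals_with_gdp(medals, code_by_country, gdp_by_code)

# Post-merge check: list the rows that still have no GDP value.
missing_gdp = [r["country"] for r in merged if r["gdp"] is None]
print(missing_gdp)
```

Keeping unmatched rows (a left join) instead of dropping them is what lets the Refugee Olympic Team stay in the table with null GDP fields.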
Challenges Faced
- Merging datasets: the country_code column in the Olympics data does not follow the standard Alpha-3 format. I found it to be a mix of Alpha-2 codes and abbreviations of the English country names, so I changed all of them to match the Alpha-3 standard.
- No available GDP data: the World Bank data I used does not include Taiwan or North Korea. Taiwan's GDP was easy to find from another source. North Korea's GDP data is limited and usually estimated, because of the country's isolation and lack of transparent economic reporting, so I searched for a GDP estimate and used that.
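Both fixes can be sketched in a few lines. The correction table below contains only the "US" vs. "USA" example mentioned earlier; the helper names, the row layout, and the supplementary GDP values are hypothetical placeholders, not the figures actually used. Note that the three-letter check only catches codes with the wrong shape; it does not prove that a code is a valid Alpha-3 code.

```python
import re

# Manual corrections from non-standard codes to Alpha-3.
# Only the "US" example comes from the text; extend as needed.
CODE_FIXES = {"US": "USA"}


def to_alpha3(code, fixes=CODE_FIXES):
    """Apply manual corrections, then check that the result has three letters."""
    code = code.strip().upper()
    code = fixes.get(code, code)
    if not re.fullmatch(r"[A-Z]{3}", code):
        raise ValueError(f"not in Alpha-3 format: {code!r}")
    return code


def fill_missing_gdp(rows, supplementary):
    """Fill gdp and gdp_year from a separate source where the main source had none."""
    filled = []
    for row in rows:
        if row["gdp"] is None and row["country_code"] in supplementary:
            gdp, year = supplementary[row["country_code"]]
            row = {**row, "gdp": gdp, "gdp_year": year}
        filled.append(row)
    return filled


# Placeholder numbers only; these are NOT real GDP figures.
supplementary = {"TWN": (111.0, 2023), "PRK": (222.0, 2023)}
rows = [
    {"country_code": to_alpha3("us"), "gdp": 999.0, "gdp_year": 2023},
    {"country_code": "TWN", "gdp": None, "gdp_year": None},
    {"country_code": "EOR", "gdp": None, "gdp_year": None},
]
filled = fill_missing_gdp(rows, supplementary)
```

Rows that are in neither source, such as the Refugee Olympic Team here, simply keep their null GDP fields.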
How to use the data
This dataset can be used for analysis of the correlation between a country's economic standing (as represented by GDP per capita) and its performance in the 2024 Summer Olympics. Some potential questions that can be explored using this dataset include:
- Does a higher GDP correlate with more Olympic medals?
- Are there countries that perform better than expected based on their economic status?
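As a starting point for both questions, the sketch below computes a Pearson correlation between GDP and total medals, then uses the residuals from a least-squares line to flag countries that won more medals than the line predicts. It is one possible approach under simple assumptions, not the only reasonable one: the tuple layout and the toy values are made up, rows with null GDP are skipped, and with real data a log transform of GDP or a rank-based correlation may be more appropriate.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(xs, ys):
    """Observed y minus the value predicted by a least-squares line on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Toy rows: (country_code, gdp, total_medals). Values are hypothetical.
rows = [("AAA", 1.0, 2), ("BBB", 2.0, 4), ("CCC", 3.0, 9), ("DDD", None, 1)]

# Skip rows with no GDP value (e.g. the Refugee Olympic Team).
usable = [r for r in rows if r[1] is not None]
gdp = [r[1] for r in usable]
medals = [r[2] for r in usable]

r_value = pearson(gdp, medals)
res = residuals(gdp, medals)

# Countries above the fitted line won more medals than GDP alone predicts.
over_performers = [row[0] for row, e in zip(usable, res) if e > 0]
print(round(r_value, 3), over_performers)
```

A positive residual only means a country sits above this particular fitted line; population, sporting tradition, and other factors are not accounted for.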
Acknowledgement