Countryinfo

Variables that may help predict the progression of COVID-19.

@kaggle.koryto_countryinfo

About this Dataset

Greetings everyone!
I hope you find this dataset valuable for your COVID-19 models.
It is aligned with SRK's Novel Corona Virus dataset.
Feel free to upvote if you use it!

This dataset contains what I consider essential demographic information for every country listed in the COVID-19 competition submission file.
It also includes additional data that I consider critical for predicting each country's infection and mortality rates, such as the number of COVID-19 detection tests, the detection date of 'patient zero', and the dates of the initial restrictions.
Please see the column descriptions for a comprehensive explanation.

Major Insights:

  1. I've seen some fairly clear differences between female and male mortality rates, as men tend to develop more severe symptoms.
    Therefore, I added variables representing the sex ratio (number of males per female) in each country, broken down by age group and in total.
    I also added lung disease data (death rate per 100k people) for each country, likewise separated by sex.
  2. The average number of children per woman has a quite high p-value when analyzing the trend of confirmed cases, especially in interaction with 'density' and school restrictions.
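
As a rough illustration of the two derived quantities mentioned above, the sex ratio (males per female) and a death rate per 100k people, here is a minimal Python sketch. The function names and the input figures are made up for the example and are not taken from the dataset, whose actual column names may differ.

```python
def sex_ratio(males: float, females: float) -> float:
    """Return the number of males per female."""
    if females <= 0:
        raise ValueError("females must be positive")
    return males / females


def rate_per_100k(deaths: float, population: float) -> float:
    """Return deaths scaled to a population of 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population


# Hypothetical figures, for illustration only (not real country data).
example_ratio = sex_ratio(males=4_900_000, females=5_000_000)
example_rate = rate_per_100k(deaths=250, population=1_000_000)

print(f"sex ratio: {example_ratio:.2f} males per female")
print(f"death rate: {example_rate:.1f} per 100k people")
```

A ratio below 1 means there are fewer males than females in that group. Scaling to 100k people makes death rates comparable across countries of different sizes.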

Citations and Data Gathering

  1. https://www.worldometers.info/ - Population, Density, Median Age, Urban Population, Fertility Rate, Patient Zero Detection Date, Confirmed Cases, New Cases, Total Deaths, Total Recovered, Critical Cases.
  2. @benhamner's link (see the Acknowledgements section below) - Initial restriction dates.
  3. https://worldpopulationreview.com/countries/smoking-rates-by-country/ - % of smokers by country.
  4. https://data.worldbank.org/indicator/SH.MED.BEDS.ZS - Hospital beds per 1000 citizens.
  5. https://en.wikipedia.org/wiki/List_of_countries_by_sex_ratio - Sex ratio by age.
  6. https://www.worldlifeexpectancy.com/cause-of-death/lung-disease/by-country/ - Lung diseases death rate.
  7. https://en.wikipedia.org/wiki/COVID-19_testing - COVID-19 tests.
  8. https://www.worldbank.org/ - GDP 2019, health expenses (missing values were filled in with information from Wikipedia).
  9. https://en.climate-data.org/ - Temperature and Humidity raw data.

Acknowledgements

  1. Restrictions are taken from here. Thanks to Ben Hamner for sharing this link!
  2. Special thanks to @diamondsnake for the idea of collecting the average temperature and humidity.

Good luck learning more about the virus! Feel free to comment and collaborate so that we can collect more relevant data.
