The goal is to forecast the spatiotemporal patterns of influenza outbreaks

The goal is to forecast the spatiotemporal patterns of influenza outbreaks across locations and dates by identifying influenza-related tweets.

The data comes from the United States and is organized by state and by week. For each week, the task is to predict whether an influenza outbreak occurs in the following week. Influenza activity is graded on four levels, from minimal to high, according to the CDC Flu Activity Map. An influenza outbreak is recorded when the activity level is high.
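The labeling rule above can be sketched in a few lines of Python. This is only an illustration: the description names just the two endpoints of the scale (minimal and high), so the intermediate level names and the helper `is_outbreak` are assumptions, not part of the dataset.

```python
# Four activity levels, ordered from lowest to highest.
# Only "minimal" and "high" are named in the description;
# "low" and "moderate" are assumed names for the two middle levels.
ACTIVITY_LEVELS = ("minimal", "low", "moderate", "high")


def is_outbreak(level: str) -> int:
    """Return 1 if the activity level counts as an outbreak (high), else 0."""
    normalized = level.strip().lower()
    if normalized not in ACTIVITY_LEVELS:
        raise ValueError(f"unknown activity level: {level!r}")
    return int(normalized == "high")


# Hypothetical sequence of weekly activity levels for one state.
weekly_levels = ["minimal", "low", "high", "moderate", "high"]
labels = [is_outbreak(level) for level in weekly_levels]
```

Only the top level maps to 1, so a week with moderate activity is still labeled as no outbreak.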

Variable Information

The input to the prediction task is the set of keyword counts over all tweets posted in a state during a given week. The output is the occurrence of an influenza outbreak in that state in the following week: zero if no outbreak occurs, one otherwise. The variables are summarized below:

'flu_locations': a list of states.
'flu_keywords': keyword list.
'flu_X_': input data for all the locations and all the weeks.
'flu_Y_': output data for all the locations and all the weeks.

The data uses 525 keywords, listed in the variable 'flu_keywords'.
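To make the input/output relationship concrete, here is a minimal Python sketch of pairing one week's keyword counts with the following week's outbreak flag. It assumes the flags are indexed by the same week as the counts; if the stored 'flu_Y_' values are already shifted to the next week, no shift is needed. The function name `make_next_week_pairs` and the toy numbers are hypothetical, and the real data would have 525 counts per week rather than two.

```python
def make_next_week_pairs(counts, outbreak):
    """Pair week t's keyword counts with week t+1's outbreak flag.

    counts:   list of per-week keyword-count vectors for one state
    outbreak: list of 0/1 flags for the same weeks (assumed same-week indexing)
    Returns (features, targets), each one week shorter than the inputs,
    because the last week has no following week to predict.
    """
    if len(counts) != len(outbreak):
        raise ValueError("counts and outbreak must cover the same weeks")
    features = counts[:-1]   # weeks 0 .. T-2
    targets = outbreak[1:]   # weeks 1 .. T-1
    return features, targets


# Toy example: four weeks, two keywords per week.
counts = [[3, 0], [5, 1], [9, 4], [2, 0]]
outbreak = [0, 0, 1, 0]
X, y = make_next_week_pairs(counts, outbreak)
```

Here the counts from the second week, `[5, 1]`, are paired with the outbreak flag of the third week.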
