Baselight

Disaster Tweets, Geocoded Locations

Geocoded locations for the Real or Not? NLP with Disaster Tweets competition

@kaggle.herwinvw_disaster_tweets_geocoded_locations


About this Dataset

Context

This dataset is an attempt to make use of the location feature in the "Real or Not? NLP with Disaster Tweets" competition.
I geocoded the locations, hoping that at least the difference between locations that can be geocoded (e.g. Birmingham) and those that cannot (e.g. "your sisters bedroom") would be a good feature. Geocoding also provides longitude and latitude features that may be helpful.
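As a rough illustration of the idea above, the sketch below turns one geocode row into numeric features for a classifier. The function name `location_features` and the choice to fill missing coordinates with 0.0 are assumptions made for this example, not part of the dataset; it also assumes that longitude and latitude are null when a location could not be geocoded, which the dataset description does not state.

```python
from typing import Dict, Optional, TypedDict


class GeocodeRow(TypedDict):
    """One row with the column names used by the geocode tables."""
    id: int
    has_location: bool
    geocoded: bool
    longitude: Optional[float]
    latitude: Optional[float]


def location_features(row: GeocodeRow) -> Dict[str, float]:
    """Hypothetical helper: map a geocode row to numeric features.

    Assumption: coordinates are only meaningful when `geocoded` is true.
    Missing coordinates are replaced by 0.0; the `geocoded` flag tells
    the model whether the coordinates carry any information.
    """
    geocoded = bool(row["geocoded"])
    lon = row["longitude"] if geocoded else None
    lat = row["latitude"] if geocoded else None
    return {
        "has_location": 1.0 if row["has_location"] else 0.0,
        "geocoded": 1.0 if geocoded else 0.0,
        "longitude": float(lon) if lon is not None else 0.0,
        "latitude": float(lat) if lat is not None else 0.0,
    }
```

Keeping the `geocoded` flag next to the filled coordinates lets a model tell a real (0, 0) coordinate apart from a missing one. Other fill strategies (for example a separate missing-value indicator per coordinate) would work equally well.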

Content

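The dataset records, for each entry, whether its location could be geocoded, that is, whether it resolves to a valid place in the world. Each table holds an id, a has_location flag, a geocoded flag, and the longitude and latitude.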

Acknowledgements

Geocoding was done with Nominatim.

Inspiration

Can you make better tweet classifications with geocoded locations?

Tables

Test Geocodes

@kaggle.herwinvw_disaster_tweets_geocoded_locations.test_geocodes
  • 42.03 KB
  • 3263 rows
  • 5 columns

CREATE TABLE test_geocodes (
  "id" BIGINT,
  "has_location" BOOLEAN,
  "geocoded" BOOLEAN,
  "longitude" DOUBLE,
  "latitude" DOUBLE
);

Train Geocodes

@kaggle.herwinvw_disaster_tweets_geocoded_locations.train_geocodes
  • 88.99 KB
  • 7613 rows
  • 5 columns

CREATE TABLE train_geocodes (
  "id" BIGINT,
  "has_location" BOOLEAN,
  "geocoded" BOOLEAN,
  "longitude" DOUBLE,
  "latitude" DOUBLE
);
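To get a feel for the schema above, the following sketch loads it into an in-memory SQLite database and counts how many rows have a location and how many were geocoded. The helper name `geocode_summary` is hypothetical, and the three inserted rows are toy values made up for illustration, not records from the dataset. SQLite is used only because it ships with Python; the hosted tables may run on a different engine.

```python
import sqlite3

# Schema copied from the table definition above.
SCHEMA = """
CREATE TABLE train_geocodes (
  "id" BIGINT,
  "has_location" BOOLEAN,
  "geocoded" BOOLEAN,
  "longitude" DOUBLE,
  "latitude" DOUBLE
);
"""

ALLOWED_TABLES = ("train_geocodes", "test_geocodes")


def geocode_summary(conn: sqlite3.Connection, table: str = "train_geocodes") -> dict:
    """Hypothetical helper: count rows, rows with a location, geocoded rows."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table}")
    # Booleans are stored as 0/1 in SQLite, so SUM counts the true values.
    total, with_location, geocoded = conn.execute(
        f'SELECT COUNT(*), COALESCE(SUM("has_location"), 0), '
        f'COALESCE(SUM("geocoded"), 0) FROM {table}'
    ).fetchone()
    return {"rows": total, "with_location": with_location, "geocoded": geocoded}


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Toy rows for illustration only (not taken from the dataset).
conn.executemany(
    "INSERT INTO train_geocodes VALUES (?, ?, ?, ?, ?)",
    [
        (1, True, True, -1.9, 52.48),   # location present and geocoded
        (2, True, False, None, None),   # location present, not geocodable
        (3, False, False, None, None),  # no location given
    ],
)

summary = geocode_summary(conn)
print(summary)
```

The same query shape applies to test_geocodes, since both tables share the same five columns.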
