Dataset: NLP With Disaster Tweets - Cleaning Data

About this Dataset

Context

This dataset was obtained by cleaning the data of the Getting Started Prediction Competition "Real or Not? NLP with Disaster Tweets". It is the output of the public notebook "NLP with Disaster Tweets - EDA and Cleaning data".
In the future, I plan to improve the cleaning and update the dataset.

Content

id - a unique identifier for each tweet
text - the text of the tweet
location - the location the tweet was sent from (may be blank)
keyword - a particular keyword from the tweet (may be blank)
target - in train.csv only, this denotes whether a tweet is about a real disaster (1) or not (0)
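
As a rough illustration of the column layout above, the sketch below parses a small in-memory sample with Python's standard csv module. The sample rows and the helper name read_tweets are hypothetical and exist only for this example. The column order is assumed to match the table definitions in the Tables section.

```python
import csv
import io

# Hypothetical two-row sample in the training layout; real rows will differ.
SAMPLE_TRAIN_CSV = """id,keyword,location,text,target
1,,,our deeds are the reason of this earthquake,1
48,ablaze,Birmingham,what a lovely sunset tonight,0
"""


def read_tweets(handle):
    """Parse rows, mapping blank keyword/location to None and target to int."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {
            "id": int(raw["id"]),
            "keyword": raw["keyword"] or None,    # may be blank
            "location": raw["location"] or None,  # may be blank
            "text": raw["text"],
        }
        # The target column exists only in the training data.
        if raw.get("target") not in (None, ""):
            row["target"] = int(raw["target"])
        rows.append(row)
    return rows


tweets = read_tweets(io.StringIO(SAMPLE_TRAIN_CSV))
```

The same function should also work on test data, where the target column is absent and the resulting rows simply carry no "target" key.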

Acknowledgements

Thanks to the Kaggle team for the competition "Real or Not? NLP with Disaster Tweets" and its datasets. The original dataset was created by the company figure-eight and first shared on their ‘Data For Everyone’ website. Tweet source: https://twitter.com/AnyOtherAnnaK/status/629195955506708480.

Thanks to the website "Ambulance services drive, strive to keep you alive" for its image, which closely resembles the image of the competition "Real or Not? NLP with Disaster Tweets" and which I used as the image for this dataset.

Inspiration

You are predicting whether a given tweet is about a real disaster or not. If so, predict a 1. If not, predict a 0.
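
To make the expected output concrete, here is a minimal sketch of producing one 0/1 prediction per tweet. The word list, the predict rule, and the id,target output layout are assumptions made for illustration. The rule is a placeholder, not a real model and not the method used in the notebook.

```python
import csv
import io

# Hypothetical word list, chosen only for illustration; not from the dataset.
DISASTER_WORDS = {"earthquake", "wildfire", "flood", "evacuation"}


def predict(text):
    """Return 1 if the tweet text contains any listed word, else 0."""
    words = set(text.lower().split())
    return 1 if words & DISASTER_WORDS else 0


def write_predictions(rows, handle):
    """Write a header line, then one (id, target) line per tweet."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["id", "target"])
    for row in rows:
        writer.writerow([row["id"], predict(row["text"])])


# Hypothetical sample rows standing in for the test data.
sample_rows = [
    {"id": 0, "text": "Forest fire and flood warnings issued"},
    {"id": 2, "text": "I love this song"},
]
out = io.StringIO()
write_predictions(sample_rows, out)
```

Any classifier can replace predict as long as it returns 1 for a real disaster and 0 otherwise.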

Tables

Test Data Cleaning

@kaggle.vbmokin_nlp_with_disaster_tweets_cleaning_data.test_data_cleaning

253.85 KB
3263 rows
4 columns


CREATE TABLE test_data_cleaning (
  "id" BIGINT,
  "keyword" VARCHAR,
  "location" VARCHAR,
  "text" VARCHAR
);

Test Data Cleaning2

@kaggle.vbmokin_nlp_with_disaster_tweets_cleaning_data.test_data_cleaning2

253.85 KB
3263 rows
4 columns


CREATE TABLE test_data_cleaning2 (
  "id" BIGINT,
  "keyword" VARCHAR,
  "location" VARCHAR,
  "text" VARCHAR
);

Train Data Cleaning

@kaggle.vbmokin_nlp_with_disaster_tweets_cleaning_data.train_data_cleaning

545.74 KB
7613 rows
5 columns


CREATE TABLE train_data_cleaning (
  "id" BIGINT,
  "keyword" VARCHAR,
  "location" VARCHAR,
  "text" VARCHAR,
  "target" BIGINT
);

Train Data Cleaning2

@kaggle.vbmokin_nlp_with_disaster_tweets_cleaning_data.train_data_cleaning2

545.74 KB
7613 rows
5 columns


CREATE TABLE train_data_cleaning2 (
  "id" BIGINT,
  "keyword" VARCHAR,
  "location" VARCHAR,
  "text" VARCHAR,
  "target" BIGINT
);
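
As a usage illustration for the schemas above, the following sketch loads the train_data_cleaning definition into an in-memory SQLite database and counts rows per target value. The three inserted rows are hypothetical samples, and SQLite is an assumption made here for convenience; the tables may be hosted on a different engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE train_data_cleaning (
      "id" BIGINT,
      "keyword" VARCHAR,
      "location" VARCHAR,
      "text" VARCHAR,
      "target" BIGINT
    )
    """
)

# Hypothetical sample rows; blank keyword/location are stored as NULL.
sample_rows = [
    (1, None, None, "forest fire near the village", 1),
    (2, "flood", "Texas", "flood waters rising downtown", 1),
    (3, "ablaze", None, "this mixtape is ablaze", 0),
]
conn.executemany(
    "INSERT INTO train_data_cleaning VALUES (?, ?, ?, ?, ?)", sample_rows
)

# Class balance: number of tweets per target value.
counts = dict(
    conn.execute(
        'SELECT "target", COUNT(*) FROM train_data_cleaning GROUP BY "target"'
    ).fetchall()
)

# Number of tweets with no keyword.
blank_keywords = conn.execute(
    'SELECT COUNT(*) FROM train_data_cleaning WHERE "keyword" IS NULL'
).fetchone()[0]

conn.close()
```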