Baselight

SMS Spam Collection (Text Classification)

Labeled SMS messages collected for mobile phone spam research

@kaggle.thedevastator_sms_spam_collection_a_more_diverse_dataset

About this Dataset


Source

Hugging Face Hub: link

About this dataset

The SMS Spam Collection v.1 is a set of SMS messages that have been collected and labeled as either spam or not spam (ham). The dataset contains 5574 real, non-encoded English messages and is intended for mobile phone spam research.

How to use the dataset

Research Ideas

  • This dataset could be used to train a machine learning model to classify SMS messages as spam or not spam.
  • This dataset could be used to develop a tool that can automatically identify and block spam messages.
  • This dataset could be used to study the characteristics of spam messages and develop strategies for identifying and avoiding them.
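
As a rough illustration of the first idea, the sketch below trains a tiny multinomial Naive Bayes classifier with Laplace smoothing using only the Python standard library. Naive Bayes is chosen here only because it is short to write; the dataset page does not prescribe any model. The messages in `TOY_DATA` are made up for the example and are not rows from the dataset, and the names `tokenize`, `train`, and `predict` are hypothetical helpers.

```python
import math
import re
from collections import Counter

# Made-up toy messages for illustration only; NOT rows from the dataset.
TOY_DATA = [
    ("win a free prize now", "spam"),
    ("free entry claim your cash prize", "spam"),
    ("urgent call now to claim free cash", "spam"),
    ("are we still meeting for lunch", "ham"),
    ("see you at home tonight", "ham"),
    ("can you call me after lunch", "ham"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(rows):
    """Count word occurrences per class and messages per class."""
    word_counts = {"ham": Counter(), "spam": Counter()}
    class_counts = Counter()
    for text, label in rows:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, class_counts


def predict(text, word_counts, class_counts):
    """Return the class with the higher log posterior (add-one smoothing)."""
    vocab = set(word_counts["ham"]) | set(word_counts["spam"])
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_messages)
        for token in tokenize(text):
            score += math.log(
                (word_counts[label][token] + 1) / (n_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, class_counts = train(TOY_DATA)
print(predict("claim your free prize", word_counts, class_counts))
print(predict("lunch at home", word_counts, class_counts))
```

To try this on the real data, `TOY_DATA` would be replaced by (text, label) pairs read from train.csv, with a held-out split to measure accuracy.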

Acknowledgements

_This dataset is used to train a machine learning model to classify SMS messages as spam or not spam.

The SMS Spam Collection v.1 is a public set of labeled SMS messages collected for mobile phone spam research. The dataset contains 5574 real, non-encoded English messages, each tagged as legitimate (ham) or spam. It was collected from various sources and is released under the CC BY-SA 4.0 license by Kaggle user Almeida et al._

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: train.csv

| Column name | Description |
| --- | --- |
| sms | The text of the SMS message. (String) |
| label | The label for the SMS message, indicating whether it is ham or spam. (String) |
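
A minimal sketch of reading a file with these two columns, using Python's standard `csv` module. The two rows in `SAMPLE` are made up so the snippet runs without the file, and `load_rows` is a hypothetical helper. The label is treated as opaque text, so the code works whether the file stores it as a word (ham/spam) or as an integer code.

```python
import csv
import io
from collections import Counter

# Stand-in for train.csv: made-up rows using the documented column names.
# For the real file, pass open("train.csv", newline="", encoding="utf-8")
# instead of io.StringIO(SAMPLE).
SAMPLE = 'sms,label\n"Free prize, reply now",spam\nSee you at lunch,ham\n'


def load_rows(handle):
    """Read (sms, label) pairs from a CSV file object that has a header row."""
    reader = csv.DictReader(handle)
    return [(row["sms"], row["label"]) for row in reader]


rows = load_rows(io.StringIO(SAMPLE))

# Tally how many messages carry each label value.
label_counts = Counter(label for _, label in rows)
print(label_counts)
```

`csv.DictReader` handles quoted fields, which matters here because SMS text often contains commas.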

Tables

Train

@kaggle.thedevastator_sms_spam_collection_a_more_diverse_dataset.train
  • 320.56 KB
  • 5574 rows
  • 2 columns

CREATE TABLE train (
  "sms" VARCHAR,
  "label" BIGINT
);
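
The schema above can be exercised locally with Python's built-in `sqlite3` module, as in the sketch below. The inserted rows are made up, and the encoding 0 = ham, 1 = spam is an assumption; the page does not state how the BIGINT values map to the two classes. SQLite is used only as a convenient stand-in for the hosted table.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE train ("sms" VARCHAR, "label" BIGINT)')

# Made-up rows; 0 = ham and 1 = spam is an ASSUMED encoding.
conn.executemany(
    "INSERT INTO train (sms, label) VALUES (?, ?)",
    [
        ("See you at lunch", 0),
        ("Free prize, reply now", 1),
        ("Call me later", 0),
    ],
)

# Count messages per label value to check class balance.
counts = dict(
    conn.execute(
        'SELECT "label", COUNT(*) FROM train GROUP BY "label" ORDER BY "label"'
    ).fetchall()
)
print(counts)
conn.close()
```

The same `GROUP BY` query is a quick way to check class balance before training a classifier.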