Labeled SMS messages collected for mobile phone spam research
Dataset Description
SMS Spam Collection (Text Classification)
Source
Hugging Face Hub: link
About this dataset
The SMS Spam Collection v.1 is a set of SMS messages that have been collected and labeled as either spam or not spam (ham). It contains 5,574 real, non-encoded English messages and is intended for mobile phone spam research.
How to use the dataset
Research Ideas
- This dataset could be used to train a machine learning model to classify SMS messages as spam or not spam.
- This dataset could be used to develop a tool that can automatically identify and block spam messages.
- This dataset could be used to study the characteristics of spam messages and develop strategies for identifying and avoiding them.
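As a rough illustration of the first idea, the sketch below trains a small multinomial naive Bayes classifier with add-one smoothing using only the Python standard library. The function names (`tokenize`, `train_naive_bayes`, `predict`) and the four toy messages are made up for this example and are not taken from the dataset; a real experiment would train on the `sms` and `label` columns and evaluate on held-out messages.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train_naive_bayes(messages, labels):
    """Collect per-class token counts, per-class message counts, and the vocabulary."""
    word_counts = {}  # label -> Counter of tokens seen in that class
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the label with the highest log-probability under the model."""
    word_counts, class_counts, vocab = model
    total_messages = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_messages in class_counts.items():
        counts = word_counts[label]
        # Add-one (Laplace) smoothing: every vocabulary word gets one extra count.
        denom = sum(counts.values()) + len(vocab)
        score = math.log(n_messages / total_messages)
        for token in tokenize(text):
            if token in vocab:  # tokens never seen in training are ignored
                score += math.log((counts[token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, NOT drawn from the dataset.
train_messages = [
    "Win a free prize now, call this number",
    "Free entry to claim your cash prize",
    "Are we still meeting for lunch today",
    "I will call you when I get home",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(train_messages, train_labels)
prediction_spam = predict("Claim your free prize", model)
prediction_ham = predict("See you at lunch", model)
print(prediction_spam, prediction_ham)
```

Naive Bayes is used here only because it is short to write and a common baseline for text classification; nothing in this dataset description prescribes a particular model.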
Acknowledgements
_This dataset can be used to train a machine learning model to classify SMS messages as spam or not spam.
The SMS Spam Collection v.1 is a public set of labeled SMS messages collected for mobile phone spam research. It contains 5,574 real, non-encoded English messages, each tagged as legitimate (ham) or spam. The dataset has been collected from various sources and is released under the CC BY-SA 4.0 license by Almeida et al._
License
> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Columns
File: train.csv
| Column name | Description |
|---|---|
| sms | The text of the SMS message. (String) |
| label | The label for the SMS message, indicating whether it is ham or spam. (String) |
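A minimal sketch of reading a file with these two columns, using Python's built-in `csv` module. The in-memory `sample_csv` stands in for `train.csv`, its two rows are invented, and the helper name `load_messages` is hypothetical. The label values `ham` and `spam` follow the column description above; the actual file may encode them differently, so it is worth checking the distinct values first.

```python
import csv
import io
from collections import Counter

# Stand-in for train.csv with the documented header; the rows are made up.
sample_csv = io.StringIO(
    "sms,label\n"
    '"Free prize, call now",spam\n'
    "See you at lunch,ham\n"
)


def load_messages(handle):
    """Read (sms, label) pairs from an open CSV handle with an 'sms,label' header."""
    reader = csv.DictReader(handle)
    return [(row["sms"], row["label"]) for row in reader]


rows = load_messages(sample_csv)

# Count how many messages carry each label to check the class balance.
label_counts = Counter(label for _, label in rows)
print(rows[0])
print(dict(label_counts))
```

For the real file, the same helper could be called as `load_messages(open("train.csv", newline="", encoding="utf-8"))`, assuming the file sits in the working directory and is UTF-8 encoded.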