Spam URLs Classification Dataset

Classifying URLs as spam or not spam

@kaggle.shivamb_spam_url_prediction

About this Dataset

URL - Spam or Not Spam - Classification Dataset

This dataset contains about 87.5K URLs, of which one-third are flagged as spam and the rest are not. It can be used to build a binary classification model.

Credits:

The dataset was created by The Pudding. It contains every link found in a range of newsletters. A flagging system parses links from over 100 newsletters every 30 minutes and identifies whether each link is spam. A link is programmatically flagged if it appears 3+ times in a single newsletter or contains a likely subscribe/unsubscribe URL. If you use this dataset, don't forget to cite the author.
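
The flagging rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not The Pudding's actual code: the function name `flag_links`, the `SUBSCRIBE_HINTS` keyword list, and the example URLs are all hypothetical, and the real system's test for a "likely subscribe/unsubscribe URL" may well differ from a simple substring match.

```python
from collections import Counter

# Hypothetical keyword list. The dataset description only says
# "likely subscribe/unsubscribe URL", so this match is an assumption.
SUBSCRIBE_HINTS = ("unsubscribe", "subscribe")


def flag_links(links, min_repeats=3, hints=SUBSCRIBE_HINTS):
    """Return {url: is_flagged} for the links of a single newsletter.

    A link is flagged if it appears at least `min_repeats` times in the
    newsletter, or if its URL contains one of the `hints` substrings.
    """
    counts = Counter(links)
    flags = {}
    for url, n in counts.items():
        repeated = n >= min_repeats
        looks_like_subscribe = any(h in url.lower() for h in hints)
        flags[url] = repeated or looks_like_subscribe
    return flags


# Made-up links from one newsletter, for illustration only.
newsletter_links = (
    ["https://example.com/promo"] * 3
    + ["https://example.com/article"]
    + ["https://example.com/unsubscribe?id=1"]
)
flags = flag_links(newsletter_links)
```

Here `flags` marks the repeated promo link and the unsubscribe link as spam and leaves the single article link unflagged. Counting is done per newsletter because the rule refers to repeats "in a single newsletter".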
