Emails For Spam Or Ham Classification (Enron 2006)

Kaggle

@kaggle.bayes2003_emails_for_spam_or_ham_classification_enron_2006

The Enron-Spam datasets

Dataset Description

This dataset contains emails for spam or ham classification, taken from the "Enron-Spam datasets". It covers the six sets, Enron1 to Enron6, in the form pre-processed by the original authors. There are two files:

  1. email_origin.csv: Original pre-processed email with label.
    Columns:
      • label: Int type, 1 for spam and 0 for ham
      • origin: String type, original pre-processed email
  2. email_text.csv: Processed (by me) email body with label.
    Columns:
      • label: Int type, 1 for spam and 0 for ham
      • text: String type, processed email body
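
The file layout above can be read with a few lines of Python. This is a minimal sketch, not the author's own loading code: it assumes each CSV starts with a header row naming the columns listed above and is UTF-8 encoded. The helper names load_labeled_emails and label_counts are hypothetical.

```python
import csv
from collections import Counter


def load_labeled_emails(path, text_column="text"):
    """Read (label, text) pairs from one of the dataset's CSV files.

    Assumes a header row with a `label` column and a text column
    (`text` for email_text.csv, `origin` for email_origin.csv).
    """
    # Email bodies can be long, so raise the default csv field size limit.
    csv.field_size_limit(10**9)
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            rows.append((int(row["label"]), row[text_column]))
    return rows


def label_counts(rows):
    """Count emails per class: label 0 is ham, label 1 is spam."""
    counts = Counter(label for label, _ in rows)
    return {"ham": counts[0], "spam": counts[1]}
```

For example, label_counts(load_labeled_emails("email_text.csv")) would report the class balance, and passing text_column="origin" reads email_origin.csv the same way.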

How I processed the emails (from email_origin to email_text):

Email Processing

More datasets for spam or ham classification:

Emails for spam or ham classification (Trec 2007)

Emails for spam or ham classification (Trec 2006)

Emails for spam or ham classification (Trec 2005)

Emails for spam or ham classification SpamAssassin

Source:
http://nlp.cs.aueb.gr/software_and_datasets/Enron-Spam/index.html

