Classify website URLs into different categories
Dataset Description
Context
This dataset was created by scraping different websites and then classifying them into different categories based on the extracted text.
Content
The columns are described below; the names are largely self-explanatory.
website_url: URL of the website.
cleaned_website_text: the cleaned text content extracted from the
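
As a rough sketch of how the columns above might be used, the snippet below reads rows with Python's standard `csv` module and counts websites per label. The sample rows are made up for illustration, and the label column name `Category` is an assumption: the description names only `website_url` and `cleaned_website_text`.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only; they are not taken from the dataset.
# The "Category" column name is an assumption; the description names only
# website_url and cleaned_website_text.
SAMPLE_CSV = """website_url,cleaned_website_text,Category
https://example.com/shop,buy shoes online free shipping,E-Commerce
https://example.org/news,latest world news headlines today,News
https://example.net/store,discount electronics store deals,E-Commerce
"""


def load_rows(handle):
    """Read every row as a dict keyed by column name."""
    return list(csv.DictReader(handle))


def category_counts(rows, label_column="Category"):
    """Count how many websites fall under each label."""
    return Counter(row[label_column] for row in rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = category_counts(rows)

print(rows[0]["website_url"])
print(counts.most_common())
```

To run the same functions on the downloaded file, pass a handle opened with `open(path, newline="", encoding="utf-8")` in place of the in-memory sample, and adjust `label_column` to whatever the label column is actually called.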
Related Datasets
- Spam URLs Classification Dataset (@kaggle)
- Taxonomy For NIST Website (@usgov)