Baselight

Website Classification

Classify website URLs into different categories

@kaggle.hetulmehta_website_classification

About this Dataset

Context

This dataset was created by scraping different websites and then classifying them into different categories based on the extracted text.

Content

The columns are described below; the names are largely self-explanatory.
website_url: URL of the website.
cleaned_website_text: the cleaned text content extracted from the
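
As a rough illustration of working with the two columns described above, the following sketch reads a CSV export and checks that both documented columns are present. It assumes the data is available as a CSV with a header row; the sample rows, the `load_rows` helper, and the `REQUIRED_COLUMNS` name are hypothetical and do not come from the dataset itself.

```python
import csv
import io

# Hypothetical placeholder rows for illustration only;
# these are NOT values taken from the real dataset.
SAMPLE_CSV = """website_url,cleaned_website_text
https://example.com,example domain illustrative text
https://example.org,another placeholder page text
"""

# The two column names documented in the dataset description.
REQUIRED_COLUMNS = ("website_url", "cleaned_website_text")


def load_rows(handle):
    """Read rows from a CSV file object, keeping only the documented columns.

    Raises ValueError if a documented column is missing from the header.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{name: row[name] for name in REQUIRED_COLUMNS} for row in reader]


# In practice the handle would come from open(path, newline="", encoding="utf-8"),
# where the path to the downloaded file is up to the reader.
rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["website_url"], "->", len(row["cleaned_website_text"].split()), "words")
```

Any extra columns in the file are ignored by this sketch, so it should still work if the export contains more fields than the two listed here.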
