Hierarchical Text Classification
Exploring approaches to text classification with structured classes
@kaggle.kashnitsky_hierarchical_text_classification
It's interesting to explore various approaches to hierarchical text classification.
Let's start with a dataset of Amazon product reviews whose classes are organized into three levels: 6 "level 1" classes, 64 "level 2" classes, and 510 "level 3" classes.
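I share 3 files: train_40k and val_10k, which carry the labels cat1, cat2, and cat3, and unlabeled_150k, which has only title and text.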
Level 1 classes are: health personal care, toys games, beauty, pet supplies, baby products, and grocery gourmet food.
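To make the three-level structure concrete, here is a minimal sketch that counts distinct labels per level and flags labels that appear under more than one parent. The level 1 names are taken from the list above; the level 2 and level 3 names, and the helper names `level_counts`, `parents_of`, and `ambiguous_labels`, are made up for illustration and are not part of the dataset.

```python
from collections import defaultdict

# Toy (cat1, cat2, cat3) triples. The level 1 names come from the dataset
# description; the level 2 and level 3 names are hypothetical.
ROWS = [
    ("toys games", "puzzles", "jigsaw puzzles"),
    ("toys games", "puzzles", "3-d puzzles"),
    ("toys games", "games", "board games"),
    ("pet supplies", "dogs", "toys"),
    ("pet supplies", "cats", "toys"),
]


def level_counts(rows):
    """Return the number of distinct labels at each of the three levels."""
    return tuple(len({row[i] for row in rows}) for i in range(3))


def parents_of(rows, level):
    """Map each label at zero-based `level` (1 or 2) to the set of its parents."""
    parents = defaultdict(set)
    for row in rows:
        parents[row[level]].add(row[level - 1])
    return dict(parents)


def ambiguous_labels(rows, level):
    """Return labels seen under more than one parent at the given level."""
    found = parents_of(rows, level)
    return sorted(label for label, ps in found.items() if len(ps) > 1)


print(level_counts(ROWS))         # distinct labels at levels 1, 2, 3
print(ambiguous_labels(ROWS, 2))  # level 3 names shared by several parents
```

Run on the real training labels, the first call should reproduce the 6 / 64 / 510 counts quoted above. The second call is worth checking before modeling: if a level 3 name repeats under different parents, the name alone does not identify a path and the full (cat1, cat2, cat3) triple is the safer key.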
Ideas to explore:
CREATE TABLE train_40k (
"productid" VARCHAR,
"title" VARCHAR,
"userid" VARCHAR,
"helpfulness" VARCHAR,
"score" DOUBLE,
"time" BIGINT,
"text" VARCHAR,
"cat1" VARCHAR,
"cat2" VARCHAR,
"cat3" VARCHAR
);

CREATE TABLE unlabeled_150k (
"title" VARCHAR,
"text" VARCHAR
);

CREATE TABLE val_10k (
"productid" VARCHAR,
"title" VARCHAR,
"userid" VARCHAR,
"helpfulness" VARCHAR,
"score" DOUBLE,
"time" BIGINT,
"text" VARCHAR,
"cat1" VARCHAR,
"cat2" VARCHAR,
"cat3" VARCHAR
);
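Since train_40k and val_10k carry a label at every level (cat1, cat2, cat3), one way to compare approaches is to score each level separately and also require the whole path to match. The sketch below is only one possible evaluation, not something the dataset prescribes. The function name `per_level_accuracy` is hypothetical, and the labels below level 1 are invented for the example.

```python
LEVELS = ("cat1", "cat2", "cat3")  # label columns shared by train_40k and val_10k


def per_level_accuracy(y_true, y_pred):
    """Return accuracy at each level plus the share of fully correct paths.

    Both arguments are equal-length lists of dicts keyed by LEVELS.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("need two non-empty lists of equal length")
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    scores = {
        level: sum(t[level] == p[level] for t, p in pairs) / n
        for level in LEVELS
    }
    # A path counts only if every level is predicted correctly.
    scores["full_path"] = sum(
        all(t[level] == p[level] for level in LEVELS) for t, p in pairs
    ) / n
    return scores


def as_row(cat1, cat2, cat3):
    """Build a label dict from three strings (illustrative helper)."""
    return dict(zip(LEVELS, (cat1, cat2, cat3)))


# Hypothetical gold labels and predictions; only the cat1 names are real.
gold = [
    as_row("beauty", "skin care", "face"),
    as_row("beauty", "hair care", "shampoos"),
    as_row("baby products", "feeding", "bottles"),
    as_row("grocery gourmet food", "beverages", "tea"),
]
pred = [
    as_row("beauty", "skin care", "face"),
    as_row("beauty", "hair care", "conditioners"),
    as_row("baby products", "gear", "strollers"),
    as_row("grocery gourmet food", "beverages", "tea"),
]

print(per_level_accuracy(gold, pred))
```

On this toy input the scores drop from level 1 to level 3, which is the pattern to expect when errors compound down the hierarchy. The full-path score can never exceed the lowest per-level score.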