Russian Social Media Text Classification
Text classification data from VK CUP 2022
@kaggle.mikhailma_russian_social_media_text_classification
VKontakte communities can belong to one of several predefined categories. Even among sports communities, though, there is a fairly strong division by subject: the same authors may write about a single sport or about many sports at once.
Based on a given set of posts, determine the topic - what kind of sport is being discussed in the selected community?
Here is a list of available categories:
The evaluation metric looks like this:
def score(true, pred, n_samples):
    counter = 0
    if true == pred:
        counter += 1
    else:
        counter -= 1
    return counter / n_samples
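As written, the snippet above compares a single pair of labels. A minimal sketch of how it might be applied to a whole set of predictions follows, assuming the +1/-1 rule is meant to be accumulated over all samples before dividing by their count. The name `mean_score` is hypothetical and is not part of the competition code.

```python
from typing import Sequence


def mean_score(true: Sequence[str], pred: Sequence[str]) -> float:
    """Average the per-sample +1/-1 rule over all samples.

    Assumption: the competition metric sums +1 for each correct label
    and -1 for each wrong one, then divides by the number of samples.
    """
    if len(true) != len(pred):
        raise ValueError("true and pred must have the same length")
    if len(true) == 0:
        raise ValueError("at least one sample is required")

    counter = 0
    for t, p in zip(true, pred):
        # Same rule as in score(): reward a match, penalise a mismatch.
        counter += 1 if t == p else -1
    return counter / len(true)
```

Under this assumption the value ranges from -1 (every label wrong) to 1 (every label right) and equals 2 * accuracy - 1.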
CREATE TABLE sample_submission (
    "oid" BIGINT,
    "category" VARCHAR
);

CREATE TABLE test (
    "oid" BIGINT,
    "text" VARCHAR
);

CREATE TABLE train (
    "oid" BIGINT,
    "category" VARCHAR,
    "text" VARCHAR
);
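Since the task is framed per community, it may help to collect the posts by `oid` before modelling. The sketch below assumes that `oid` identifies a community, that several rows can share one `oid`, and that the `train` table is available as CSV text with a header row. None of this is stated in the schema itself, and the function name `group_posts_by_oid` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List


def group_posts_by_oid(csv_text: str) -> Dict[int, List[str]]:
    """Group the "text" column by "oid" for rows shaped like the train table."""
    posts: Dict[int, List[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # "oid" is BIGINT in the schema, so parse it as an integer.
        posts[int(row["oid"])].append(row["text"])
    return dict(posts)


# Made-up rows for illustration only; the category names are placeholders.
sample = (
    "oid,category,text\n"
    "1,cat_a,first post\n"
    "1,cat_a,second post\n"
    "2,cat_b,third post\n"
)
grouped = group_posts_by_oid(sample)
```

The same function works on rows shaped like the `test` table, because it reads only the `oid` and `text` columns.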