Baselight

Russian Social Media Text Classification

Text classification data from VK CUP 2022

@kaggle.mikhailma_russian_social_media_text_classification

About this Dataset

VKontakte communities each belong to one of several predefined categories, but even sports communities divide sharply by subject: the same authors may write about a single sport or about many at once.
Given a set of posts, determine the topic: which sport is discussed in the selected community?

Here is a list of available categories:

  1. athletics,
  2. autosport,
  3. basketball,
  4. boardgames,
  5. esport,
  6. extreme,
  7. football,
  8. hockey,
  9. martial arts,
  10. motosport,
  11. tennis,
  12. volleyball,
  13. winter_sport
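
For modeling, category names like these are usually mapped to integer ids. The following is a minimal sketch of such a mapping. The names CATEGORIES, LABEL_TO_ID and ID_TO_LABEL are illustrative and not part of the dataset, and the strings are copied exactly as listed here, so the label strings in the actual data files may differ (for example in underscore usage).

```python
# Category names copied from the list above (assumption: the dataset files
# may spell some labels differently, e.g. with underscores).
CATEGORIES = [
    "athletics", "autosport", "basketball", "boardgames", "esport",
    "extreme", "football", "hockey", "martial arts", "motosport",
    "tennis", "volleyball", "winter_sport",
]

# Zero-based integer id for each category, and the reverse lookup.
LABEL_TO_ID = {name: i for i, name in enumerate(CATEGORIES)}
ID_TO_LABEL = {i: name for name, i in LABEL_TO_ID.items()}
```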

The evaluation metric looks like this:

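def score(true, pred, n_samples):
    # true and pred are equal-length sequences of category labels
    counter = 0
    for t, p in zip(true, pred):
        # +1 for a correct prediction, -1 for an incorrect one
        if t == p:
            counter += 1
        else:
            counter -= 1
    # Normalize by the number of samples; the result lies in [-1, 1]
    return counter / n_samples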
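
Read over a whole set of predictions, the metric above adds 1 for each correct label and subtracts 1 for each wrong one, so it works out to 2 * accuracy - 1 and ranges from -1 to 1. The sketch below checks this on a tiny example. It assumes true and pred are equal-length lists of category names; the function name signed_score and the sample labels are illustrative, not part of the competition code.

```python
def signed_score(true, pred):
    """+1 per correct label, -1 per wrong one, averaged over all samples."""
    if len(true) != len(pred):
        raise ValueError("true and pred must have the same length")
    n_samples = len(true)
    counter = sum(1 if t == p else -1 for t, p in zip(true, pred))
    return counter / n_samples


# Hypothetical labels, for illustration only
true = ["hockey", "tennis", "esport", "football"]
pred = ["hockey", "tennis", "football", "football"]

# Plain accuracy: fraction of predictions that match the true label
accuracy = sum(t == p for t, p in zip(true, pred)) / len(true)
result = signed_score(true, pred)
```

With three of four predictions correct, accuracy is 0.75 and the score is 0.5. A classifier that is right half the time scores 0, so random guessing over 13 categories scores well below zero.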
