Baselight

PubMed MultiLabel Text Classification Dataset MeSH

Extreme Multi Label Text Classification on Biomedical PubMed Articles

@kaggle.owaiskhan9654_pubmed_multilabel_text_classification

About this Dataset

This dataset is a collection of approximately 50k research articles from the PubMed repository. The documents were originally annotated manually by biomedical experts with MeSH labels, and each article is described by 10-15 MeSH labels. The dataset contains a very large number of distinct labels in the MeSH major field, which leads to an extremely large output space and severe label sparsity. To address this, the dataset has been processed and each label mapped to its root, as described below.
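
The root mapping mentioned above can be sketched roughly as follows. MeSH tree numbers begin with a single letter naming their top-level category (for example, C for Diseases), so collapsing a label to its root amounts to keeping that letter. The function names and the sample tree numbers are illustrative assumptions, not the dataset author's actual preprocessing code, and the exact format of the meshid column is not stated on this page.

```python
# Top-level MeSH category letters; the tables use the lowercase forms as column names.
ROOTS = ["A", "B", "C", "D", "E", "F", "G", "H",
         "I", "J", "K", "L", "M", "N", "V", "Z"]


def roots_of(tree_numbers):
    """Return the sorted, de-duplicated root letters for a list of MeSH tree numbers."""
    return sorted({t.strip()[0].upper() for t in tree_numbers if t.strip()})


def multi_hot(tree_numbers):
    """Encode the roots as a 0/1 vector aligned with ROOTS."""
    present = set(roots_of(tree_numbers))
    return [1 if r in present else 0 for r in ROOTS]


# Hypothetical tree numbers for one article (not taken from the dataset).
example = ["C04.588.180", "D12.776.543", "C23.550.727"]
print(roots_of(example))   # ['C', 'D']
print(multi_hot(example))
```

Reducing thousands of fine-grained labels to 16 root categories in this way trades label detail for a much denser, easier-to-learn target.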

Tables

Pubmed Multi Label Text Classification Dataset

@kaggle.owaiskhan9654_pubmed_multilabel_text_classification.pubmed_multi_label_text_classification_dataset
  • 11.77 MB
  • 10,000 rows
  • 22 columns
CREATE TABLE pubmed_multi_label_text_classification_dataset (
  "title" VARCHAR,
  "abstracttext" VARCHAR,
  "meshmajor" VARCHAR,
  "pmid" BIGINT,
  "meshid" VARCHAR,
  "meshroot" VARCHAR,
  "a" BIGINT,
  "b" BIGINT,
  "c" BIGINT,
  "d" BIGINT,
  "e" BIGINT,
  "f" BIGINT,
  "g" BIGINT,
  "h" BIGINT,
  "i" BIGINT,
  "j" BIGINT,
  "k" BIGINT,
  "l" BIGINT,
  "m" BIGINT,
  "n" BIGINT,
  "v" BIGINT,
  "z" BIGINT
);
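
As a rough illustration of how such a table might be queried, the sketch below counts how often each label occurs and the average number of labels per article. It assumes the single-letter columns are 0/1 indicators for the MeSH root categories, which this page does not state explicitly. The table name `articles`, the reduced column set, and the rows are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for the real table: only pmid and three label columns.
conn.execute('CREATE TABLE articles ("pmid" BIGINT, "a" BIGINT, "b" BIGINT, "c" BIGINT)')
rows = [(1, 1, 0, 1), (2, 0, 0, 1), (3, 1, 1, 1)]  # made-up rows
conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?)", rows)

label_cols = ["a", "b", "c"]

# Per-label frequency: sum each indicator column.
sums = ", ".join(f'SUM("{c}")' for c in label_cols)
counts = dict(zip(label_cols, conn.execute(f"SELECT {sums} FROM articles").fetchone()))

# Label cardinality: average number of labels set per row.
per_row = " + ".join(f'"{c}"' for c in label_cols)
avg_labels = conn.execute(f"SELECT AVG({per_row}) FROM articles").fetchone()[0]

print(counts)      # {'a': 2, 'b': 1, 'c': 3}
print(avg_labels)  # 2.0
```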

Pubmed Multi Label Text Classification Dataset Processed

@kaggle.owaiskhan9654_pubmed_multilabel_text_classification.pubmed_multi_label_text_classification_dataset_processed
  • 59.11 MB
  • 50,000 rows
  • 20 columns
CREATE TABLE pubmed_multi_label_text_classification_dataset_processed (
  "title" VARCHAR,
  "abstracttext" VARCHAR,
  "meshmajor" VARCHAR,
  "pmid" BIGINT,
  "meshid" VARCHAR,
  "meshroot" VARCHAR,
  "a" BIGINT,
  "b" BIGINT,
  "c" BIGINT,
  "d" BIGINT,
  "e" BIGINT,
  "f" BIGINT,
  "g" BIGINT,
  "h" BIGINT,
  "i" BIGINT,
  "j" BIGINT,
  "l" BIGINT,
  "m" BIGINT,
  "n" BIGINT,
  "z" BIGINT
);
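
The two schemas differ in their single-letter columns, which matters if rows from both tables are combined. The short sketch below compares the column lists as they appear in the two CREATE TABLE statements; the variable names are illustrative only.

```python
# Single-letter columns copied from the two CREATE TABLE statements above.
raw_labels = list("abcdefghijklmnvz")       # 22-column table
processed_labels = list("abcdefghijlmnz")   # 20-column processed table

# Columns present in the first table but absent from the processed one.
missing = sorted(set(raw_labels) - set(processed_labels))
# Columns safe to select from both tables.
shared = [c for c in raw_labels if c in processed_labels]

print(missing)      # ['k', 'v']
print(len(shared))  # 14
```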
