MIDAS Hand-Annotated News

A Corpus of Physician-Defined Topics for Data Science and Machine Learning

@kaggle.thedevastator_midas_hand_annotated_news


About this Dataset

This dataset is a hand-annotated collection of news articles covering five physician-defined topics: childhood obesity, mental health, diabetes, children in care, and infectious diseases including coronavirus. It is distributed in three formats (TXT, CSV, and JSON) and is intended to support research in data science and machine learning.

The source material consists of 2020 news articles provided in TXT format. Each article is manually annotated with up to 10 MeSH (Medical Subject Headings) headings found in the piece. The accompanying JSON file supports tools that evaluate classifier accuracy.

The dataset columns are x (article ID, integer), y (article text, string), and z[0] through z[8] (MeSH headings 1 through 9, strings).

How to use the dataset

The TXT files hold the source text of every news article for each of the dataset’s physician-defined topics. They are useful for browsing the articles on a given topic without any annotation attached.

The CSV file contains the hand annotations: up to 10 Medical Subject Headings (MeSH) per news article. Because the headings for each article sit side by side in a single table row, it is easy to see which health topics an article covers and how several topics intersect within the same article. The columns in this CSV include: Article ID; Article Text; MeSH Heading 1 through 9; and Topic Label Used For Evaluation On Test Set.
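As a minimal sketch of working with the annotation CSVs, the snippet below gathers the non-empty MeSH headings for each article. The column names (`article_name`, `mesh_heading_1` to `mesh_heading_10`) follow the annotation table schemas listed under Tables. The sample rows, file names, and placeholder IDs are invented for illustration and are not taken from the dataset.

```python
import csv
import io

# Hypothetical sample shaped like the annotation tables (not real dataset rows).
SAMPLE_CSV = """n,article_name,mesh_heading_1,mesh_id_1,mesh_heading_2,mesh_id_2
1,example_article_a.txt,Pediatric Obesity,ID_A,Diet,ID_B
2,example_article_b.txt,Mental Health,ID_C,,
"""

def headings_by_article(csv_text, max_headings=10):
    """Map each article name to its list of non-empty MeSH headings."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        headings = []
        for i in range(1, max_headings + 1):
            # Columns missing from the file, or left blank, are skipped.
            value = (row.get(f"mesh_heading_{i}") or "").strip()
            if value:
                headings.append(value)
        result[row["article_name"]] = headings
    return result

annotations = headings_by_article(SAMPLE_CSV)
print(annotations)
```

To read a real file, the `io.StringIO` wrapper could be replaced with an opened file handle.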

Finally, the JSON file provides input for evaluation when NLP predictive models, such as deep learning classifiers, are used to label new media articles based on their content and on the labels assigned by healthcare experts. The JSON is described as containing:

  • headers for key article components, with corresponding values that describe their contents, such as domains (for example, whether sentiment analysis is included or excluded);
  • subject heading keys and values: relevant keywords and phrases used in the text body, section titles, and headlines of an article;
  • metadata structure and entity location identifiers, including geographical details of the physical locations cited.

This can support researchers developing machine learning models that could later be built into day-to-day workflows on healthcare platforms.
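The table names under Tables (`pr_*`, `re_*`, `f1_*`) suggest that evaluation is reported as precision, recall, and F1, although the source does not say so explicitly. Under that assumption, a minimal sketch of scoring one article's predicted headings against its hand annotations could look like this. The function name and sample labels are hypothetical.

```python
def precision_recall_f1(predicted, annotated):
    """Set-based precision, recall, and F1 for one article's headings."""
    predicted, annotated = set(predicted), set(annotated)
    true_positives = len(predicted & annotated)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical prediction and annotation for a single article.
p, r, f1 = precision_recall_f1(
    predicted=["Diabetes Mellitus", "Diet", "Exercise"],
    annotated=["Diabetes Mellitus", "Diet", "Insulin", "Obesity"],
)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Scores for a whole test set could then be averaged per article or pooled across articles; which of the two the dataset authors used is not stated in the source.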

By combining these formats, you can gain greater insight into complex areas relating not only to diseases but also to the many other factors that influence people’s physical and mental wellbeing. Whether you are extracting summaries from hand-annotated news articles or digging into the more technical subject matter a research project may require, the dataset suits both novice users and experienced professionals.

Research Ideas

  • Training a supervised machine learning model to predict the MeSH headings of news articles, and therefore the medical topics they cover.
  • Developing AI computer vision technology that can identify and analyze images in order to better categorize them according to their medical topics.
  • Using this dataset to design algorithms that detect connections between the medical topics of articles in real time, giving health professionals, journalists, and researchers faster access to meaningful insights into healthcare trends and developments.
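The last idea, finding connections between topics, can start from something as simple as counting which headings are annotated together. The sketch below is one possible starting point rather than a method described by the dataset authors. The function name and sample annotations are hypothetical.

```python
from collections import Counter
from itertools import combinations

def heading_cooccurrence(articles):
    """Count how often each unordered pair of headings shares an article."""
    counts = Counter()
    for headings in articles.values():
        # Sorting makes ("Diet", "Obesity") and ("Obesity", "Diet") the same key.
        for pair in combinations(sorted(set(headings)), 2):
            counts[pair] += 1
    return counts

# Hypothetical annotations: article name -> list of MeSH headings.
sample = {
    "article_a": ["Diabetes Mellitus", "Diet", "Obesity"],
    "article_b": ["Diet", "Obesity"],
    "article_c": ["Mental Health"],
}
pairs = heading_cooccurrence(sample)
print(pairs.most_common(1))
```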

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: re_IRE.csv

Column name  Description
x            Article title. (String)
y            Article text. (String)
z[0]         MeSH heading 1. (String)
z[1]         MeSH heading 2. (String)
z[2]         MeSH heading 3. (String)
z[3]         MeSH heading 4. (String)
z[4]         MeSH heading 5. (String)
z[5]         MeSH heading 6. (String)
z[6]         MeSH heading 7. (String)
z[7]         MeSH heading 8. (String)
z[8]         MeSH heading 9. (String)

File: re_INF.csv

Column name  Description
x            Article title. (String)
y            Article text. (String)
z[0]         MeSH heading 1. (String)
z[1]         MeSH heading 2. (String)
z[2]         MeSH heading 3. (String)
z[3]         MeSH heading 4. (String)
z[4]         MeSH heading 5. (String)
z[5]         MeSH heading 6. (String)
z[6]         MeSH heading 7. (String)
z[7]         MeSH heading 8. (String)
z[8]         MeSH heading 9. (String)

Tables

Eus

@kaggle.thedevastator_midas_hand_annotated_news.eus
  • 19.23 KB
  • 50 rows
  • 24 columns

CREATE TABLE eus (
  "n" BIGINT,
  "article_name" VARCHAR,
  "mesh_heading_1" VARCHAR,
  "mesh_id_1" VARCHAR,
  "mesh_heading_2" VARCHAR,
  "mesh_id_2" VARCHAR,
  "mesh_heading_3" VARCHAR,
  "mesh_id_3" VARCHAR,
  "mesh_heading_4" VARCHAR,
  "mesh_id_4" VARCHAR,
  "mesh_heading_5" VARCHAR,
  "mesh_id_5" VARCHAR,
  "mesh_heading_6" VARCHAR,
  "mesh_id_6" VARCHAR,
  "mesh_heading_7" VARCHAR,
  "mesh_id_7" VARCHAR,
  "mesh_heading_8" VARCHAR,
  "mesh_id_8" VARCHAR,
  "mesh_heading_9" VARCHAR,
  "mesh_id_9" VARCHAR,
  "mesh_heading_10" VARCHAR,
  "mesh_id_10" VARCHAR,
  "unnamed_22" VARCHAR,
  "unnamed_23" VARCHAR
);
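Annotation tables with this shape store each heading and its ID in a numbered pair of columns. For analysis it is often easier to have one record per heading. The following is a minimal sketch of that reshaping, assuming the `mesh_heading_N` / `mesh_id_N` naming shown in the schema. The function name, example row, and placeholder IDs are hypothetical.

```python
def wide_to_long(row, max_headings=10):
    """Turn one wide annotation row into (article, rank, heading, mesh_id) records."""
    records = []
    for i in range(1, max_headings + 1):
        heading = (row.get(f"mesh_heading_{i}") or "").strip()
        mesh_id = (row.get(f"mesh_id_{i}") or "").strip()
        if heading:  # skip unused heading slots
            records.append((row["article_name"], i, heading, mesh_id))
    return records

# Hypothetical row; real rows would come from the annotation table.
example_row = {
    "n": 1,
    "article_name": "example_article.txt",
    "mesh_heading_1": "Communicable Diseases",
    "mesh_id_1": "ID_A",
    "mesh_heading_2": "Vaccination",
    "mesh_id_2": "ID_B",
    "mesh_heading_3": None,
    "mesh_id_3": None,
}
long_records = wide_to_long(example_row)
print(long_records)
```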

F1 Eus

@kaggle.thedevastator_midas_hand_annotated_news.f1_eus
  • 9.64 KB
  • 36 rows
  • 11 columns

CREATE TABLE f1_eus (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

F1 Fin

@kaggle.thedevastator_midas_hand_annotated_news.f1_fin
  • 9.54 KB
  • 36 rows
  • 11 columns

CREATE TABLE f1_fin (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

F1 Inf

@kaggle.thedevastator_midas_hand_annotated_news.f1_inf
  • 8.97 KB
  • 36 rows
  • 11 columns

CREATE TABLE f1_inf (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

F1 Ire

@kaggle.thedevastator_midas_hand_annotated_news.f1_ire
  • 10.09 KB
  • 36 rows
  • 11 columns

CREATE TABLE f1_ire (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

F1 Nir

@kaggle.thedevastator_midas_hand_annotated_news.f1_nir
  • 9.85 KB
  • 36 rows
  • 11 columns

CREATE TABLE f1_nir (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Fin

@kaggle.thedevastator_midas_hand_annotated_news.fin
  • 19.88 KB
  • 51 rows
  • 22 columns

CREATE TABLE fin (
  "n" BIGINT,
  "article_name" VARCHAR,
  "mesh_heading_1" VARCHAR,
  "mesh_id_1" VARCHAR,
  "mesh_heading_2" VARCHAR,
  "mesh_id_2" VARCHAR,
  "mesh_heading_3" VARCHAR,
  "mesh_id_3" VARCHAR,
  "mesh_heading_4" VARCHAR,
  "mesh_id_4" VARCHAR,
  "mesh_heading_5" VARCHAR,
  "mesh_id_5" VARCHAR,
  "mesh_heading_6" VARCHAR,
  "mesh_id_6" VARCHAR,
  "mesh_heading_7" VARCHAR,
  "mesh_id_7" VARCHAR,
  "mesh_heading_8" VARCHAR,
  "mesh_id_8" VARCHAR,
  "mesh_heading_9" VARCHAR,
  "mesh_id_9" VARCHAR,
  "mesh_heading_10" VARCHAR,
  "mesh_id_10" VARCHAR
);

Inf

@kaggle.thedevastator_midas_hand_annotated_news.inf
  • 18.4 KB
  • 50 rows
  • 22 columns

CREATE TABLE inf (
  "n" BIGINT,
  "article_name" VARCHAR,
  "mesh_heading_1" VARCHAR,
  "mesh_id_1" VARCHAR,
  "mesh_heading_2" VARCHAR,
  "mesh_id_2" VARCHAR,
  "mesh_heading_3" VARCHAR,
  "mesh_id_3" VARCHAR,
  "mesh_heading_4" VARCHAR,
  "mesh_id_4" VARCHAR,
  "mesh_heading_5" VARCHAR,
  "mesh_id_5" VARCHAR,
  "mesh_heading_6" VARCHAR,
  "mesh_id_6" VARCHAR,
  "mesh_heading_7" VARCHAR,
  "mesh_id_7" VARCHAR,
  "mesh_heading_8" VARCHAR,
  "mesh_id_8" VARCHAR,
  "mesh_heading_9" VARCHAR,
  "mesh_id_9" VARCHAR,
  "mesh_heading_10" VARCHAR,
  "mesh_id_10" VARCHAR
);

Ire

@kaggle.thedevastator_midas_hand_annotated_news.ire
  • 22.32 KB
  • 20 rows
  • 26 columns

CREATE TABLE ire (
  "n" BIGINT,
  "article_name" VARCHAR,
  "mesh_heading_1" VARCHAR,
  "mesh_id_1" VARCHAR,
  "mesh_heading_2" VARCHAR,
  "mesh_id_2" VARCHAR,
  "mesh_heading_3" VARCHAR,
  "mesh_id_3" VARCHAR,
  "mesh_heading_4" VARCHAR,
  "mesh_id_4" VARCHAR,
  "mesh_heading_5" VARCHAR,
  "mesh_id_5" VARCHAR,
  "mesh_heading_6" VARCHAR,
  "mesh_id_6" VARCHAR,
  "mesh_heading_7" VARCHAR,
  "mesh_id_7" VARCHAR,
  "mesh_heading_8" VARCHAR,
  "mesh_id_8" VARCHAR,
  "mesh_heading_9" VARCHAR,
  "mesh_id_9" VARCHAR,
  "mesh_heading_10" VARCHAR,
  "mesh_id_10" VARCHAR,
  "unnamed_22" VARCHAR,
  "unnamed_23" VARCHAR,
  "unnamed_24" VARCHAR,
  "unnamed_25" VARCHAR
);
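The `unnamed_22` to `unnamed_25` columns look like residue from stray trailing cells in the original spreadsheet, though the source does not explain them. Assuming that is the case, a cautious sketch is to drop an `unnamed_*` column only when it is empty in every row, so that any stray notes are kept for inspection. The sample rows and the helper name are hypothetical.

```python
import csv
import io

# Hypothetical sample with two trailing unnamed columns (not real dataset rows).
SAMPLE = """n,article_name,mesh_heading_1,mesh_id_1,unnamed_22,unnamed_23
1,example_a.txt,Diabetes Mellitus,ID_A,,
2,example_b.txt,Mental Health,ID_B,,stray note
"""

def drop_empty_unnamed(csv_text):
    """Remove unnamed_* columns that are blank in every row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    empty = [
        name for name in columns
        if name.startswith("unnamed_")
        and all(not (row[name] or "").strip() for row in rows)
    ]
    cleaned = [{k: v for k, v in row.items() if k not in empty} for row in rows]
    return cleaned, empty

cleaned, dropped = drop_empty_unnamed(SAMPLE)
print(dropped)
```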

Nir

@kaggle.thedevastator_midas_hand_annotated_news.nir
  • 17.05 KB
  • 50 rows
  • 22 columns

CREATE TABLE nir (
  "n" BIGINT,
  "article_name" VARCHAR,
  "mesh_heading_1" VARCHAR,
  "mesh_id_1" VARCHAR,
  "mesh_heading_2" VARCHAR,
  "mesh_id_2" VARCHAR,
  "mesh_heading_3" VARCHAR,
  "mesh_id_3" VARCHAR,
  "mesh_heading_4" VARCHAR,
  "mesh_id_4" VARCHAR,
  "mesh_heading_5" VARCHAR,
  "mesh_id_5" VARCHAR,
  "mesh_heading_6" VARCHAR,
  "mesh_id_6" VARCHAR,
  "mesh_heading_7" VARCHAR,
  "mesh_id_7" VARCHAR,
  "mesh_heading_8" VARCHAR,
  "mesh_id_8" VARCHAR,
  "mesh_heading_9" VARCHAR,
  "mesh_id_9" VARCHAR,
  "mesh_heading_10" VARCHAR,
  "mesh_id_10" VARCHAR
);

Pr Eus

@kaggle.thedevastator_midas_hand_annotated_news.pr_eus
  • 9.79 KB
  • 36 rows
  • 11 columns

CREATE TABLE pr_eus (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Pr Fin

@kaggle.thedevastator_midas_hand_annotated_news.pr_fin
  • 9.54 KB
  • 36 rows
  • 11 columns

CREATE TABLE pr_fin (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Pr Inf

@kaggle.thedevastator_midas_hand_annotated_news.pr_inf
  • 9 KB
  • 36 rows
  • 11 columns

CREATE TABLE pr_inf (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Pr Ire

@kaggle.thedevastator_midas_hand_annotated_news.pr_ire
  • 10.06 KB
  • 36 rows
  • 11 columns

CREATE TABLE pr_ire (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Pr Nir

@kaggle.thedevastator_midas_hand_annotated_news.pr_nir
  • 9.85 KB
  • 36 rows
  • 11 columns

CREATE TABLE pr_nir (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Re Eus

@kaggle.thedevastator_midas_hand_annotated_news.re_eus
  • 9.04 KB
  • 36 rows
  • 11 columns

CREATE TABLE re_eus (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Re Fin

@kaggle.thedevastator_midas_hand_annotated_news.re_fin
  • 8.79 KB
  • 36 rows
  • 11 columns

CREATE TABLE re_fin (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Re Inf

@kaggle.thedevastator_midas_hand_annotated_news.re_inf
  • 8.47 KB
  • 36 rows
  • 11 columns

CREATE TABLE re_inf (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Re Ire

@kaggle.thedevastator_midas_hand_annotated_news.re_ire
  • 10.06 KB
  • 36 rows
  • 11 columns

CREATE TABLE re_ire (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);

Re Nir

@kaggle.thedevastator_midas_hand_annotated_news.re_nir
  • 8.88 KB
  • 36 rows
  • 11 columns

CREATE TABLE re_nir (
  "x" DOUBLE,
  "y" DOUBLE,
  "z_0" DOUBLE,
  "z_1" DOUBLE,
  "z_2" DOUBLE,
  "z_3" DOUBLE,
  "z_4" DOUBLE,
  "z_5" DOUBLE,
  "z_6" DOUBLE,
  "z_7" DOUBLE,
  "z_8" DOUBLE
);
