
CDC Text Corpora For Learners: MMWR, EID, And PCD Article Metadata

@cdc.cdc_7rih_tqi5

CDC - National Center for State, Tribal, Local, and Territorial Public Health Infrastructure and Workforce

This landing page is part of the CDC Text Corpora for Learners program. The program includes 33,576 compiled CDC Text for Learners HTML mirrors of three journals: MMWR (Morbidity and Mortality Weekly Report), including its series Weekly Reports, Recommendations and Reports, Surveillance Summaries, Supplements, and Notifiable Diseases (a subset of Weekly Reports, constructed ad hoc); EID (Emerging Infectious Diseases); and PCD (Preventing Chronic Disease).

The data presented here are the tabulated metadata for the combined 33,567 articles of the MMWR, EID, and PCD collections. The article contents are organized into three ZIP-archived JSON files per collection. The JSON value output formats include UTF-8 HTML, UTF-8 Markdown, and ASCII plain text.

The JSON files are located in the program's repository. This version was constructed on 2024-03-01 using source content retrieved on 2024-01-09.
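
As an illustration of working with a ZIP-archived JSON file of the kind described above, the following Python sketch parses every JSON member of an archive using only the standard library. The function name `load_json_members` and the archive name `example_collection.zip` are hypothetical, and the internal layout of the program's JSON files is not described on this page, so the sketch makes no assumption about their keys.

```python
import json
import zipfile
from pathlib import Path


def load_json_members(zip_path):
    """Parse every .json member of a ZIP archive.

    Returns a dict mapping each member name to its parsed content.
    Hypothetical helper; it is not part of the CDC program's tooling.
    """
    parsed = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".json"):
                with archive.open(name) as handle:
                    # json.load reads the bytes and decodes them (UTF-8 here).
                    parsed[name] = json.load(handle)
    return parsed


if __name__ == "__main__":
    # Placeholder file name, not one taken from the program's repository.
    path = Path("example_collection.zip")
    if path.exists():
        for member, content in load_json_members(path).items():
            print(member, type(content).__name__)
```

Reading members directly from the archive avoids extracting files to disk, which may matter when an archive holds the text of many articles.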

Tags: harvest-cdc-journals, corpora, corpus, ncstltphiw, phic, informatics, data science, text analysis, mmwr, eid, pcd, smokefree indoor air, ml, machine learning, language, linguistics, semantics, morphology

Last updated: 2025-07-16 13:58:47+00:00

