Baselight

MedicalClaimsSynthetic1M

Medical claims synthetic 1M sample

@kaggle.drscarlat_medicalclaimssynthetic1m

About this Dataset

Medicare Claims Synthetic Public Use Files (SynPUFs)

  • Medicare Claims Synthetic Public Use Files (SynPUFs) were created to allow interested parties to gain familiarity using Medicare claims data while protecting beneficiary privacy.
  • The data structure of the Medicare SynPUFs is very similar to the CMS Limited Data Sets, but with a smaller number of variables.
  • They provide data analysts and software developers the opportunity to develop programs and products utilizing the identical formats and variable names as those which appear in the actual CMS data files.

Content

Medicare offers multiple tables, each with millions of rows. From these, I've created a single table with 1M rows by joining the claims table with the beneficiaries table. It has the following columns:

['BENE_BIRTH_DT', 'BENE_DEATH_DT', 'BENE_SEX_IDENT_CD', 'BENE_RACE_CD',
'BENE_ESRD_IND', 'SP_STATE_CODE', 'BENE_COUNTY_CD',
'BENE_HI_CVRAGE_TOT_MONS', 'BENE_SMI_CVRAGE_TOT_MONS',
'BENE_HMO_CVRAGE_TOT_MONS', 'PLAN_CVRG_MOS_NUM', 'SP_ALZHDMTA',
'SP_CHF', 'SP_CHRNKIDN', 'SP_CNCR', 'SP_COPD', 'SP_DEPRESSN',
'SP_DIABETES', 'SP_ISCHMCHT', 'SP_OSTEOPRS', 'SP_RA_OA', 'SP_STRKETIA',
'MEDREIMB_IP', 'BENRES_IP', 'MEDREIMB_OP', 'BENRES_OP', 'PPPYMT_OP',
'MEDREIMB_CAR', 'BENRES_CAR', 'PPPYMT_CAR', 'CLM_FROM_DT',
'CLM_THRU_DT', 'ICD9_DGNS_CD_1', 'HCPCS_CD_1', 'LINE_NCH_PMT_AMT_1',
'LINE_BENE_PTB_DDCTBL_AMT_1', 'LINE_COINSRNC_AMT_1',
'LINE_PRCSG_IND_CD_1', 'LINE_ICD9_DGNS_CD_1']
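
The join described above can be illustrated with a minimal sketch. It is not the code used to build this dataset. It assumes the two source tables share a beneficiary identifier column, here called `DESYNPUF_ID` (an assumption, since the column list above contains no identifier), and that an inner join was intended. The tiny in-memory tables, their values, and the helper name `join_claims_with_beneficiaries` are hypothetical.

```python
# Hypothetical miniature tables; the real source files have many more columns and rows.
beneficiaries = [
    {"DESYNPUF_ID": "A1", "BENE_BIRTH_DT": "19230501", "BENE_SEX_IDENT_CD": "1"},
    {"DESYNPUF_ID": "B2", "BENE_BIRTH_DT": "19430101", "BENE_SEX_IDENT_CD": "2"},
]
claims = [
    {"DESYNPUF_ID": "A1", "CLM_FROM_DT": "20080904", "ICD9_DGNS_CD_1": "4019"},
    {"DESYNPUF_ID": "B2", "CLM_FROM_DT": "20090312", "ICD9_DGNS_CD_1": "25000"},
    {"DESYNPUF_ID": "C3", "CLM_FROM_DT": "20100101", "ICD9_DGNS_CD_1": "4280"},
]


def join_claims_with_beneficiaries(claims, beneficiaries, key="DESYNPUF_ID"):
    """Inner join: one output row per claim that has a matching beneficiary.

    The join key (an assumed column name) is dropped from the output rows.
    """
    # Index beneficiaries by the key so each claim lookup is a dict access.
    by_id = {bene[key]: bene for bene in beneficiaries}
    joined = []
    for claim in claims:
        bene = by_id.get(claim[key])
        if bene is None:
            continue  # claim without a matching beneficiary is skipped
        # Beneficiary columns first, then claim columns, without the key.
        row = {k: v for k, v in bene.items() if k != key}
        row.update({k: v for k, v in claim.items() if k != key})
        joined.append(row)
    return joined


rows = join_claims_with_beneficiaries(claims, beneficiaries)
for row in rows:
    print(row)
```

Each output row carries the beneficiary columns followed by the claim columns, which mirrors the ordering of the column list above. The claim for `C3` is dropped because no beneficiary matches it.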

Acknowledgements

https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/SynPUFs
https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/SynPUFs/Downloads/SynPUF_DUG.pdf
