The Gender-Variant Triage Dataset: A Controlled Study of Demographic Bias

Dataset Overview

This dataset is a tightly controlled clinical evaluation set derived from 70 MIMIC-IV-ED Demo (v2.2) vignettes. It is designed to isolate and measure the effect of gender markers on algorithmic or clinical decision-making by limiting other sources of implicit bias and holding every physiological variable constant.

Structural Architecture: The "Quintet" Design

The dataset is organized into "quintets." Each unique emergency department stay is identified by a source_stay_id and assigned a sequential quintet_id, and its clinical scenario appears in exactly five rows. Each of the five rows applies one specific manipulation of the patient's gender presentation, allowing direct A/B/C/D/E testing on a single clinical case.
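The quintet structure described above can be checked mechanically. The following is a minimal sketch, assuming the rows have been loaded as a list of dictionaries keyed by the column names quintet_id, source_stay_id, and gender_variant; the helper name check_quintets and the toy IDs are illustrative and not part of the dataset.

```python
from collections import defaultdict

def check_quintets(rows):
    """Map each malformed quintet_id to a list of the problems found in it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["quintet_id"]].append(row)

    problems = {}
    for quintet_id, members in groups.items():
        issues = []
        if len(members) != 5:
            issues.append(f"expected 5 rows, found {len(members)}")
        if len({m["source_stay_id"] for m in members}) != 1:
            issues.append("rows do not share a single source_stay_id")
        if len({m["gender_variant"] for m in members}) != len(members):
            issues.append("a gender_variant value is repeated")
        if issues:
            problems[quintet_id] = issues
    return problems

# Toy rows with invented IDs; the variant labels are placeholders.
good = [{"quintet_id": 1, "source_stay_id": 1001, "gender_variant": f"v{i}"}
        for i in range(5)]
bad = [{"quintet_id": 2, "source_stay_id": 2002, "gender_variant": f"v{i}"}
       for i in range(4)]
problems = check_quintets(good + bad)
print(problems)  # {2: ['expected 5 rows, found 4']}
```

An empty result means every group has five rows, one shared stay, and no repeated variant.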

The Independent Variables: Gender Presentation Variants

The dataset manipulates the patient's demographic presentation across five distinct conditions for each vignette, detailed in the gender_variant and variant_note columns:

1. male: Contains a full male demographic signal (e.g., a traditionally male name like "Samuel", a "Male" sex_label, and "he/him" pronouns).

2. female: Contains a full female demographic signal (e.g., a traditionally female name like "Jessica", a "Female" sex_label, and "she/her" pronouns).

3. nb_full: Contains a full non-binary signal (e.g., a gender-neutral name like "Jordan", a "Non-binary" sex_label, and "they/them" pronouns).

4. nb_label_only: Isolates the sex label. It keeps the baseline male name, omits pronouns entirely, and sets the explicit sex_label field to "Non-binary".

5. nb_ambiguous: Removes all gender signals. It reduces the name to a single initial (e.g., "S.") and leaves both the pronoun and sex_label fields blank.
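To make the five conditions concrete, here is a rough sketch of how one clinical record could be expanded into a quintet. It is not the authors' generation code: the function make_variants is hypothetical, blank fields are represented as empty strings by assumption, and the initial for nb_ambiguous is assumed to come from the baseline male name (consistent with the "Samuel" and "S." examples above).

```python
def make_variants(clinical, male_name, female_name, neutral_name):
    """Expand one clinical record into five rows that differ only in gender markers."""
    def row(variant, name, sex_label, pronoun):
        # Copy the clinical fields unchanged and attach the demographic fields.
        return {**clinical, "gender_variant": variant,
                "patient_name": name, "sex_label": sex_label, "pronoun": pronoun}

    return [
        row("male", male_name, "Male", "he/him"),
        row("female", female_name, "Female", "she/her"),
        row("nb_full", neutral_name, "Non-binary", "they/them"),
        # Label only: baseline male name, no pronoun, explicit non-binary label.
        row("nb_label_only", male_name, "Non-binary", ""),
        # Ambiguous: initial only (assumed to be taken from the male name), no label or pronoun.
        row("nb_ambiguous", male_name[0] + ".", "", ""),
    ]

variants = make_variants({"chiefcomplaint": "R RIB PAIN"}, "Samuel", "Jessica", "Jordan")
for v in variants:
    print(v["gender_variant"], repr(v["patient_name"]), repr(v["sex_label"]), repr(v["pronoun"]))
```

Comparing nb_full with nb_label_only separates the effect of the explicit label from the effect of the name and pronouns, while nb_ambiguous gives a row with no gender signal at all.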

The Controlled Variables: Clinical Constancy

To ensure that any variance in a model's output can be attributed to the demographic labels rather than to clinical differences, every physiological and administrative data point remains identical across all five variants within a given quintet. The data points held constant include:

Subjective Complaints: The chiefcomplaint (e.g., "R RIB PAIN", "Chest pain, Transfer") and self-reported pain scale scores (0-13).
Objective Vitals: temperature, heartrate, resprate (respiratory rate), o2sat (oxygen saturation), sbp (systolic blood pressure), and dbp (diastolic blood pressure).
Triage & Administrative: The assigned acuity level, the arrival_transport method (e.g., AMBULANCE, WALK IN), and the final disposition (e.g., ADMITTED, HOME, TRANSFER).
Race Baseline: The race variable is tightly controlled (predominantly "WHITE") to keep historical human bias in the ground-truth labels from interfering with the gender analysis.
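The constancy claim above lends itself to a simple sanity check. The sketch below lists any controlled column that takes more than one value within a quintet. The column name pain is an assumption (the text names the pain score but not its column), and the helper varying_columns and all toy values are invented for illustration.

```python
CONTROLLED_COLUMNS = [
    "chiefcomplaint", "pain",  # "pain" is an assumed column name
    "temperature", "heartrate", "resprate", "o2sat", "sbp", "dbp",
    "acuity", "arrival_transport", "disposition", "race",
]

def varying_columns(quintet_rows, columns=CONTROLLED_COLUMNS):
    """Return the controlled columns that take more than one value in a quintet."""
    return [c for c in columns if len({row.get(c) for row in quintet_rows}) > 1]

# Invented clinical values, used only to exercise the check.
base = {
    "chiefcomplaint": "R RIB PAIN", "pain": 7,
    "temperature": 98.1, "heartrate": 88, "resprate": 18,
    "o2sat": 98, "sbp": 130, "dbp": 80,
    "acuity": 3, "arrival_transport": "WALK IN",
    "disposition": "HOME", "race": "WHITE",
}
names = ["male", "female", "nb_full", "nb_label_only", "nb_ambiguous"]
quintet = [{**base, "gender_variant": name} for name in names]

clean = varying_columns(quintet)
print(clean)  # []

# Simulate a data-entry slip in one row to show what a violation looks like.
quintet[2] = {**quintet[2], "heartrate": 95}
drifted = varying_columns(quintet)
print(drifted)  # ['heartrate']
```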

Methodology & Purpose

By holding the physiological baseline fixed and altering only the demographic metadata (patient_name, sex_label, pronoun), this dataset serves as a diagnostic tool. Because conditions with sex-dependent clinical presentations (like abdominal pain) have been excluded, the "Non-binary" label functions not as a biological indicator but purely as a demographic variable. Consequently, if a triage algorithm's output differs between rows of the same quintet, that difference indicates the system is anchoring on non-physiological demographic data rather than on medical evidence.
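One way to apply this diagnostic is to run a model over every row and flag the quintets whose outputs are not identical. The sketch below assumes a hypothetical predict callable that maps a row to a triage output; both toy models are stand-ins written for this example and do not represent any real system.

```python
from collections import defaultdict

def divergent_quintets(rows, predict):
    """Return {quintet_id: {gender_variant: output}} for quintets whose outputs differ."""
    outputs = defaultdict(dict)
    for row in rows:
        outputs[row["quintet_id"]][row["gender_variant"]] = predict(row)
    return {qid: by_variant for qid, by_variant in outputs.items()
            if len(set(by_variant.values())) > 1}

def label_sensitive_model(row):
    # Deliberately flawed stand-in: shifts acuity when it sees the label.
    return row["acuity"] + (1 if row["sex_label"] == "Non-binary" else 0)

def vitals_only_model(row):
    # Stand-in that looks only at a physiological field.
    return 2 if row["heartrate"] > 100 else 3

# One toy quintet with invented clinical values.
labels = {"male": "Male", "female": "Female", "nb_full": "Non-binary",
          "nb_label_only": "Non-binary", "nb_ambiguous": ""}
rows = [{"quintet_id": 1, "gender_variant": variant, "sex_label": label,
         "acuity": 3, "heartrate": 88}
        for variant, label in labels.items()]

flagged = divergent_quintets(rows, label_sensitive_model)
stable = divergent_quintets(rows, vitals_only_model)
print(flagged)
print(stable)  # {}
```

Because the clinical fields are identical within a quintet, any entry in the flagged result can only come from the demographic fields.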
