COMPAS Recidivism Racial Bias
Racial Bias in inmate COMPAS reoffense risk scores for Florida (ProPublica)
@kaggle.danofer_compass
COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is a popular commercial algorithm used by judges and parole officers to score a criminal defendant's likelihood of reoffending (recidivism). ProPublica's analysis, based on a two-year follow-up study (i.e., who actually committed crimes or violent crimes within two years), found the algorithm to be biased in favor of white defendants and against black defendants. The pattern of mistakes, as measured by precision and sensitivity, is notable.
Quoting from ProPublica:
"Black defendants were often predicted to be at a higher risk of recidivism than they actually were. Our analysis found that black defendants who did not recidivate over a two-year period were nearly twice as likely to be misclassified as higher risk compared to their white counterparts (45 percent vs. 23 percent).
White defendants were often predicted to be less risky than they were. Our analysis found that white defendants who re-offended within the next two years were mistakenly labeled low risk almost twice as often as black re-offenders (48 percent vs. 28 percent).
The analysis also showed that even when controlling for prior crimes, future recidivism, age, and gender, black defendants were 45 percent more likely to be assigned higher risk scores than white defendants."
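The two error rates quoted above are a false positive rate (non-recidivists labeled higher risk) and a false negative rate (recidivists labeled low risk), each computed per group. The following is a minimal sketch of that calculation, not ProPublica's own code. It uses the `race`, `decile_score`, and `is_recid` columns from the schema below. The rows are made-up toy data, and the cutoff of 5 for "higher risk" is an assumption made for illustration.

```python
import sqlite3

# Toy rows, NOT real data: (race, decile_score, is_recid).
TOY_ROWS = [
    ("African-American", 8, 0),
    ("African-American", 3, 0),
    ("African-American", 9, 1),
    ("African-American", 2, 1),
    ("Caucasian", 6, 0),
    ("Caucasian", 2, 0),
    ("Caucasian", 1, 0),
    ("Caucasian", 7, 1),
    ("Caucasian", 3, 1),
]

# Assumption: a decile score of 5 or more counts as "higher risk".
HIGH_RISK_CUTOFF = 5


def error_rates_by_race(rows, cutoff=HIGH_RISK_CUTOFF):
    """Return {race: (false_positive_rate, false_negative_rate)}."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE scores (race TEXT, decile_score INTEGER, is_recid INTEGER)"
    )
    con.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
    query = """
        SELECT race,
               -- FPR: share of non-recidivists labeled higher risk
               AVG(CASE WHEN is_recid = 0 THEN decile_score >= :cutoff END),
               -- FNR: share of recidivists labeled lower risk
               AVG(CASE WHEN is_recid = 1 THEN decile_score < :cutoff END)
        FROM scores
        GROUP BY race
        ORDER BY race
    """
    rates = {
        race: (fpr, fnr)
        for race, fpr, fnr in con.execute(query, {"cutoff": cutoff})
    }
    con.close()
    return rates


if __name__ == "__main__":
    for race, (fpr, fnr) in error_rates_by_race(TOY_ROWS).items():
        print(f"{race}: FPR={fpr:.2f} FNR={fnr:.2f}")
```

The `CASE` expressions have no `ELSE` branch, so rows outside the relevant outcome group become `NULL` and `AVG` skips them. Each rate is therefore conditioned on the actual outcome, which is what the quoted comparison requires.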
The data contains the variables used by the COMPAS algorithm in scoring defendants, along with their outcomes within two years of the decision, for over 10,000 criminal defendants in Broward County, Florida.
Three subsets of the data are provided, including a subset covering only violent recidivism (as opposed to, e.g., being reincarcerated for non-violent offenses such as vagrancy or marijuana-related charges).
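To make the distinction between general and violent recidivism concrete, here is a small sketch that buckets defendants using the `is_recid` and `is_violent_recid` flags from the schema below. The rows are toy values, not real data. Treating any flag value other than 0 or 1 as "unknown" is an assumption of this sketch.

```python
from collections import Counter

# Toy rows, NOT real data: (is_recid, is_violent_recid).
TOY_OUTCOMES = [(0, 0), (1, 0), (1, 1), (1, 0), (0, 0)]


def outcome_counts(outcomes):
    """Count defendants per outcome category."""
    counts = Counter()
    for is_recid, is_violent_recid in outcomes:
        if is_recid not in (0, 1) or is_violent_recid not in (0, 1):
            # Assumption: unexpected flag values mean the outcome is unknown.
            counts["unknown"] += 1
        elif is_violent_recid == 1:
            counts["violent recidivism"] += 1
        elif is_recid == 1:
            counts["non-violent recidivism only"] += 1
        else:
            counts["no recidivism"] += 1
    return dict(counts)


if __name__ == "__main__":
    print(outcome_counts(TOY_OUTCOMES))
```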
An in-depth analysis by ProPublica can be found in their data methodology article.
Data & original analysis gathered by ProPublica.
Original Data methodology article:
https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
Original Article:
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
Original data from ProPublica:
https://github.com/propublica/compas-analysis
Additional "simple" subset provided by FairML, based on the ProPublica data:
http://blog.fastforwardlabs.com/2017/03/09/fairml-auditing-black-box-predictive-models.html
Ideas:
CREATE TABLE compas_scores_raw (
"person_id" BIGINT,
"assessmentid" BIGINT,
"case_id" BIGINT,
"agency_text" VARCHAR,
"lastname" VARCHAR,
"firstname" VARCHAR,
"middlename" VARCHAR,
"sex_code_text" VARCHAR,
"ethnic_code_text" VARCHAR,
"dateofbirth" TIMESTAMP,
"scaleset_id" BIGINT,
"scaleset" VARCHAR,
"assessmentreason" VARCHAR,
"language" VARCHAR,
"legalstatus" VARCHAR,
"custodystatus" VARCHAR,
"maritalstatus" VARCHAR,
"screening_date" VARCHAR,
"recsupervisionlevel" BIGINT,
"recsupervisionleveltext" VARCHAR,
"scale_id" BIGINT,
"displaytext" VARCHAR,
"rawscore" DOUBLE,
"decilescore" BIGINT,
"scoretext" VARCHAR,
"assessmenttype" VARCHAR,
"iscompleted" BIGINT,
"isdeleted" BIGINT
);
CREATE TABLE cox_violent_parsed (
"id" DOUBLE,
"name" VARCHAR,
"first" VARCHAR,
"last" VARCHAR,
"compas_screening_date" TIMESTAMP,
"sex" VARCHAR,
"dob" TIMESTAMP,
"age" BIGINT,
"age_cat" VARCHAR,
"race" VARCHAR,
"juv_fel_count" BIGINT,
"decile_score" BIGINT,
"juv_misd_count" BIGINT,
"juv_other_count" BIGINT,
"priors_count" BIGINT,
"days_b_screening_arrest" DOUBLE,
"c_jail_in" VARCHAR,
"c_jail_out" VARCHAR,
"c_case_number" VARCHAR,
"c_offense_date" TIMESTAMP,
"c_arrest_date" TIMESTAMP,
"c_days_from_compas" DOUBLE,
"c_charge_degree" VARCHAR,
"c_charge_desc" VARCHAR,
"is_recid" BIGINT,
"r_case_number" VARCHAR,
"r_charge_degree" VARCHAR,
"r_days_from_arrest" DOUBLE,
"r_offense_date" TIMESTAMP,
"r_charge_desc" VARCHAR,
"r_jail_in" TIMESTAMP,
"r_jail_out" TIMESTAMP,
"violent_recid" VARCHAR,
"is_violent_recid" BIGINT,
"vr_case_number" VARCHAR,
"vr_charge_degree" VARCHAR,
"vr_offense_date" TIMESTAMP,
"vr_charge_desc" VARCHAR,
"type_of_assessment" VARCHAR,
"decile_score_1" BIGINT,
"score_text" VARCHAR,
"screening_date" TIMESTAMP,
"v_type_of_assessment" VARCHAR,
"v_decile_score" BIGINT,
"v_score_text" VARCHAR,
"v_screening_date" TIMESTAMP,
"in_custody" TIMESTAMP,
"out_custody" TIMESTAMP,
"priors_count_1" BIGINT,
"start" BIGINT,
"end" BIGINT,
"event" BIGINT
);
CREATE TABLE cox_violent_parsed_filt (
"id" DOUBLE,
"name" VARCHAR,
"first" VARCHAR,
"last" VARCHAR,
"sex" VARCHAR,
"dob" TIMESTAMP,
"age" BIGINT,
"age_cat" VARCHAR,
"race" VARCHAR,
"juv_fel_count" BIGINT,
"decile_score" BIGINT,
"juv_misd_count" BIGINT,
"juv_other_count" BIGINT,
"priors_count" BIGINT,
"days_b_screening_arrest" DOUBLE,
"c_jail_in" VARCHAR,
"c_jail_out" VARCHAR,
"c_days_from_compas" DOUBLE,
"c_charge_degree" VARCHAR,
"c_charge_desc" VARCHAR,
"is_recid" BIGINT,
"r_charge_degree" VARCHAR,
"r_days_from_arrest" DOUBLE,
"r_offense_date" TIMESTAMP,
"r_charge_desc" VARCHAR,
"r_jail_in" TIMESTAMP,
"violent_recid" VARCHAR,
"is_violent_recid" BIGINT,
"vr_charge_degree" VARCHAR,
"vr_offense_date" TIMESTAMP,
"vr_charge_desc" VARCHAR,
"type_of_assessment" VARCHAR,
"decile_score_1" BIGINT,
"score_text" VARCHAR,
"screening_date" TIMESTAMP,
"v_type_of_assessment" VARCHAR,
"v_decile_score" BIGINT,
"v_score_text" VARCHAR,
"priors_count_1" BIGINT,
"event" BIGINT
);
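The `start`, `end`, and `event` columns of `cox_violent_parsed` suggest a survival-analysis layout with possibly several rows per defendant. The sketch below collapses such rows into one summary row per `id`. It rests on assumptions about semantics that the schema alone does not confirm: that `start` and `end` are day offsets bounding an observation interval, and that `event` is 1 when an interval ends in a violent reoffense. The rows are toy values, and only the four relevant columns are recreated.

```python
import sqlite3

# Toy rows, NOT real data: (id, start, end, event).
TOY_INTERVALS = [
    (1.0, 0, 120, 0),
    (1.0, 150, 400, 1),
    (2.0, 5, 730, 0),
]


def follow_up_summary(intervals):
    """Return [(id, days_observed, had_event)], one row per id."""
    con = sqlite3.connect(":memory:")
    # "end" is an SQL keyword, so column names are quoted as in the schema.
    con.execute(
        'CREATE TABLE cox_violent_parsed '
        '("id" DOUBLE, "start" BIGINT, "end" BIGINT, "event" BIGINT)'
    )
    con.executemany(
        "INSERT INTO cox_violent_parsed VALUES (?, ?, ?, ?)", intervals
    )
    rows = con.execute(
        """
        SELECT "id",
               SUM("end" - "start") AS days_observed,
               MAX("event") AS had_event
        FROM cox_violent_parsed
        GROUP BY "id"
        ORDER BY "id"
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in follow_up_summary(TOY_INTERVALS):
        print(row)
```

Quoting `"end"` matters in practice: an unquoted `end` is parsed as a keyword in SQLite and many other SQL dialects, so queries against these tables fail without the quotes.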