Context
Workers' compensation is a form of insurance providing wage replacement and medical benefits to employees injured in the course of employment. In exchange for this coverage, the employee relinquishes the right to sue the employer in the event of an incident. The system of collective liability was created to prevent employers from becoming insolvent as a result of high damage awards, and thus to ensure the security of compensation to the workers. Individual immunity is the necessary corollary to collective liability.
Content
The data.csv file contains 54,000 insurance policies that you can use to train and validate your model.
Data fields
- ClaimNumber: Unique policy identifier
- DateTimeOfAccident: Date and time of accident
- DateReported: Date that accident was reported
- Age: Age of worker
- Gender: Gender of worker
- MaritalStatus: Marital status of worker. (M)arried, (S)ingle, (U)nknown.
- DependentChildren: The number of dependent children
- DependentsOther: The number of dependants excluding children
- WeeklyWages: Total weekly wage
- PartTimeFullTime: Binary. (P)art-time or (F)ull-time.
- HoursWorkedPerWeek: Total hours worked per week
- DaysWorkedPerWeek: Number of days worked per week
- ClaimDescription: 12 continuous variables that point to useful keywords
- InitialIncurredClaimCost: Initial estimate by the insurer of the claim cost
- UltimateIncurredClaimCost: Total claims payments by the insurance company. This is the field you are asked to predict.
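As a minimal sketch of how the fields above might be read, the snippet below parses CSV rows with the Python standard library and converts the numeric columns to floats. The helper name load_claims and the choice of which columns to treat as numeric are assumptions made for illustration, and the two sample rows are made up rather than taken from data.csv.

```python
import csv
import io

# Assumption: these columns hold numeric values. The names come from the
# field list above; the actual file may encode them differently.
NUMERIC_FIELDS = [
    "Age", "DependentChildren", "DependentsOther", "WeeklyWages",
    "HoursWorkedPerWeek", "DaysWorkedPerWeek",
    "InitialIncurredClaimCost", "UltimateIncurredClaimCost",
]


def load_claims(file_obj):
    """Read claim rows from an open CSV file and convert numeric fields.

    Hypothetical helper: missing or blank values become None instead of
    being guessed.
    """
    rows = []
    for record in csv.DictReader(file_obj):
        for name in NUMERIC_FIELDS:
            value = record.get(name)
            if value is None or value.strip() == "":
                record[name] = None
            else:
                record[name] = float(value)
        rows.append(record)
    return rows


# Made-up sample for illustration only; not real rows from data.csv.
sample = io.StringIO(
    "ClaimNumber,Age,WeeklyWages,InitialIncurredClaimCost,UltimateIncurredClaimCost\n"
    "A1,35,500.0,1000,1500\n"
    "A2,52,,2000,1800\n"
)
claims = load_claims(sample)
```

With the real file, the same helper could be called as load_claims(open("data.csv", newline="")), assuming the header row uses the field names listed above.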
Acknowledgements
The data is fully synthetic and not specific to any legal jurisdiction or country. It has been created by Colin Priest for an in-class competition organized by the Actuaries Institute of Australia, Institute and Faculty of Actuaries and the Singapore Actuarial Society.
Inspiration
Using the data, can you build a predictive model and validate it?
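One possible starting point for validation, sketched under stated assumptions: hold out a random share of the rows and score a naive baseline that predicts UltimateIncurredClaimCost as the insurer's InitialIncurredClaimCost. The helper names split_rows and mean_absolute_error, the 20% hold-out share, and the choice of mean absolute error as the metric are all assumptions; the source does not prescribe a metric. The rows below are made up for illustration.

```python
import random


def split_rows(rows, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, validation) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up rows for illustration only; each ultimate cost is the initial
# estimate plus 250, so the baseline error is known in advance.
rows = [
    {
        "InitialIncurredClaimCost": 1000.0 * i,
        "UltimateIncurredClaimCost": 1000.0 * i + 250.0,
    }
    for i in range(1, 11)
]

train, validation = split_rows(rows)

# Naive baseline: use the insurer's initial estimate as the prediction.
actual = [r["UltimateIncurredClaimCost"] for r in validation]
baseline = [r["InitialIncurredClaimCost"] for r in validation]
baseline_mae = mean_absolute_error(actual, baseline)
```

Any model fitted on the training rows can then be scored on the same validation rows and compared against baseline_mae, which gives a simple check on whether it adds anything beyond the initial estimate.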