Loan Approval Classification Dataset
Synthetic Data for binary classification on Loan Approval
@kaggle.taweilo_loan_approval_classification_data
This dataset is a synthetic version inspired by the original Credit Risk dataset on Kaggle, enriched with additional variables based on the Financial Risk for Loan Approval data. SMOTENC was used to generate new synthetic data points and increase the number of instances. The dataset contains both categorical and continuous features.
The dataset contains 45,000 records and 14 variables, each described below:
| Column | Description | Type |
|---|---|---|
| person_age | Age of the person | Float |
| person_gender | Gender of the person | Categorical |
| person_education | Highest education level | Categorical |
| person_income | Annual income | Float |
| person_emp_exp | Years of employment experience | Integer |
| person_home_ownership | Home ownership status (e.g., rent, own, mortgage) | Categorical |
| loan_amnt | Loan amount requested | Float |
| loan_intent | Purpose of the loan | Categorical |
| loan_int_rate | Loan interest rate | Float |
| loan_percent_income | Loan amount as a percentage of annual income | Float |
| cb_person_cred_hist_length | Length of credit history in years | Float |
| credit_score | Credit score of the person | Integer |
| previous_loan_defaults_on_file | Indicator of previous loan defaults | Categorical |
| loan_status (target variable) | Loan approval status: 1 = approved; 0 = rejected | Integer |
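As a rough illustration of how the column types above might be applied when reading the data from a CSV export, here is a minimal sketch using only the Python standard library. The names `COLUMN_TYPES`, `parse_row`, and `load_rows` are hypothetical, as is the idea that the file is a CSV with a header row matching the column names; categorical columns are simply kept as strings.

```python
import csv

# Column -> Python type, following the column table.
# Categorical columns are kept as plain strings.
COLUMN_TYPES = {
    "person_age": float,
    "person_gender": str,
    "person_education": str,
    "person_income": float,
    "person_emp_exp": int,
    "person_home_ownership": str,
    "loan_amnt": float,
    "loan_intent": str,
    "loan_int_rate": float,
    "loan_percent_income": float,
    "cb_person_cred_hist_length": float,
    "credit_score": int,
    "previous_loan_defaults_on_file": str,
    "loan_status": int,
}


def parse_row(raw):
    """Cast one CSV row (a dict of strings) to the documented types."""
    missing = set(COLUMN_TYPES) - set(raw)
    if missing:
        raise KeyError(f"missing columns: {sorted(missing)}")
    parsed = {}
    for name, cast in COLUMN_TYPES.items():
        text = raw[name].strip()
        # Assumption: integer columns may be written as "3.0" in some exports,
        # so go through float before converting to int.
        parsed[name] = int(float(text)) if cast is int else cast(text)
    return parsed


def load_rows(path):
    """Read a CSV file (path is supplied by the caller) into typed dicts."""
    with open(path, newline="") as handle:
        return [parse_row(raw) for raw in csv.DictReader(handle)]
```

Keeping the type map in one dictionary makes it easy to compare against the table and to spot a missing or renamed column early.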
The dataset can be used for multiple purposes:
- Classification: build predictive models for the loan_status variable (approved/not approved) for potential applicants.
- Regression: predict the credit_score variable based on individual and loan-related attributes.

Mind the data issues inherited from the original data, such as records with an age above 100 years.
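The age caveat and the two modelling targets can be sketched as a small preprocessing step. This is only one possible approach, assuming rows are already available as dictionaries keyed by column name; the helper names and the default cutoff of 100 are illustrative choices, not part of the dataset.

```python
def drop_implausible_ages(rows, max_age=100):
    """Keep only rows whose person_age is at most max_age."""
    return [row for row in rows if row["person_age"] <= max_age]


def split_features_target(rows, target):
    """Separate the target column from the remaining feature columns."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    labels = [row[target] for row in rows]
    return features, labels
```

Calling `split_features_target(rows, "loan_status")` sets up the classification task, while `split_features_target(rows, "credit_score")` sets up the regression task. Whether loan_status should remain among the features when predicting credit_score is a modelling decision this sketch leaves open.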
This dataset provides a rich basis for understanding financial risk factors and simulating predictive modeling processes for loan approval and credit scoring.
CREATE TABLE loan_data (
"person_age" DOUBLE,
"person_gender" VARCHAR,
"person_education" VARCHAR,
"person_income" DOUBLE,
"person_emp_exp" BIGINT,
"person_home_ownership" VARCHAR,
"loan_amnt" DOUBLE,
"loan_intent" VARCHAR,
"loan_int_rate" DOUBLE,
"loan_percent_income" DOUBLE,
"cb_person_cred_hist_length" DOUBLE,
"credit_score" BIGINT,
"previous_loan_defaults_on_file" VARCHAR,
"loan_status" BIGINT
);
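To try the schema locally, one option is an in-memory SQLite database via Python's built-in `sqlite3` module. SQLite is a stand-in here: the source does not say which engine the schema targets, and SQLite accepts the `DOUBLE`, `VARCHAR`, and `BIGINT` type names through its type-affinity rules. The helper names `make_connection` and `approval_rate_by_default_flag` are hypothetical.

```python
import sqlite3

# Same columns and types as the CREATE TABLE statement in this document.
SCHEMA = """
CREATE TABLE loan_data (
    "person_age" DOUBLE,
    "person_gender" VARCHAR,
    "person_education" VARCHAR,
    "person_income" DOUBLE,
    "person_emp_exp" BIGINT,
    "person_home_ownership" VARCHAR,
    "loan_amnt" DOUBLE,
    "loan_intent" VARCHAR,
    "loan_int_rate" DOUBLE,
    "loan_percent_income" DOUBLE,
    "cb_person_cred_hist_length" DOUBLE,
    "credit_score" BIGINT,
    "previous_loan_defaults_on_file" VARCHAR,
    "loan_status" BIGINT
);
"""


def make_connection():
    """Create an empty in-memory database containing the loan_data table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def approval_rate_by_default_flag(conn):
    """Map each previous_loan_defaults_on_file value to its approval rate.

    Because loan_status is 1 for approved and 0 for rejected, the average
    of loan_status within a group is the share of approved loans.
    """
    cursor = conn.execute(
        'SELECT "previous_loan_defaults_on_file", AVG("loan_status") '
        'FROM loan_data GROUP BY "previous_loan_defaults_on_file"'
    )
    return dict(cursor.fetchall())
```

After loading rows into the table, `approval_rate_by_default_flag(conn)` gives a quick first look at how the default indicator relates to the target.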