Baselight

German Credit History

financial and banking details for customers

@kaggle.ashrafkhan94_german_credit_history


About this Dataset

German Credit History

The German credit dataset describes financial and banking details for customers. The task is to classify each customer as good or bad, which is assumed to mean predicting whether the customer will pay back a loan or credit. The dataset includes 1,000 examples and 20 input variables, 7 of which are numerical (integer) and 13 of which are categorical:
  • Status of existing checking account
  • Duration in month
  • Credit history
  • Purpose
  • Credit amount
  • Savings account
  • Present employment since
  • Installment rate in percentage of disposable income
  • Personal status and sex
  • Other debtors
  • Present residence since
  • Property
  • Age in years
  • Other installment plans
  • Housing
  • Number of existing credits at this bank
  • Job
  • Number of dependents
  • Telephone
  • Foreign worker

Some of the categorical variables have an ordinal relationship, such as Savings account, although most do not. There are two outcome classes, 1 for good customers and 2 for bad customers. Good customers are the default or negative class, whereas bad customers are the exception or positive class. A total of 70 percent of the examples are good customers, whereas the remaining 30 percent of examples are bad customers.
  • Good Customers: Negative or majority class (70%).
  • Bad Customers: Positive or minority class (30%).

A cost matrix is provided with the dataset that gives a different penalty to each misclassification error for the positive class. Specifically, a cost of five is applied to a false negative (marking a bad customer as good) and a cost of one is assigned for a false positive (marking a good customer as bad).
  • Cost for False Negative: 5
  • Cost for False Positive: 1
This suggests that the positive class is the focus of the prediction task and that it is more costly to the bank or financial institution to give money to a bad customer than to not give money to a good customer. This must be taken into account when selecting a performance metric.
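
To make the effect of the cost matrix concrete, the sketch below scores two trivial strategies against the 70/30 class split described above. It is a minimal illustration, not a method shipped with the dataset: the function name `total_cost` and the synthetic label lists are assumptions, and only the costs (5 and 1) and the label coding (1 = good, 2 = bad) come from the description.

```python
# Costs taken from the dataset description.
COST_FALSE_NEGATIVE = 5  # bad customer marked as good
COST_FALSE_POSITIVE = 1  # good customer marked as bad

# Label coding from the description: 1 = good, 2 = bad (positive class).
GOOD, BAD = 1, 2


def total_cost(y_true, y_pred):
    """Sum the misclassification cost over paired true/predicted labels."""
    false_negatives = sum(
        1 for t, p in zip(y_true, y_pred) if t == BAD and p == GOOD
    )
    false_positives = sum(
        1 for t, p in zip(y_true, y_pred) if t == GOOD and p == BAD
    )
    return (COST_FALSE_NEGATIVE * false_negatives
            + COST_FALSE_POSITIVE * false_positives)


# Synthetic labels that only reproduce the stated 70/30 split (assumption).
y_true = [GOOD] * 700 + [BAD] * 300
approve_all = [GOOD] * 1000  # 70% accuracy, 300 false negatives
reject_all = [BAD] * 1000    # 30% accuracy, 700 false positives

print(total_cost(y_true, approve_all))  # 1500
print(total_cost(y_true, reject_all))   # 700
```

Under these costs, approving everyone is more than twice as expensive as rejecting everyone even though it is far more accurate, which is why plain accuracy is a poor fit for this task.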

Tables

German

@kaggle.ashrafkhan94_german_credit_history.german
  • 24.16 KB
  • 999 rows
  • 21 columns

CREATE TABLE german (
  "a11" VARCHAR,
  "n_6" BIGINT,
  "a34" VARCHAR,
  "a43" VARCHAR,
  "n_1169" BIGINT,
  "a65" VARCHAR,
  "a75" VARCHAR,
  "n_4" BIGINT,
  "a93" VARCHAR,
  "a101" VARCHAR,
  "n_4_1" BIGINT,
  "a121" VARCHAR,
  "n_67" BIGINT,
  "a143" VARCHAR,
  "a152" VARCHAR,
  "n_2" BIGINT,
  "a173" VARCHAR,
  "n_1" BIGINT,
  "a192" VARCHAR,
  "a201" VARCHAR,
  "n_1_1" BIGINT
);
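
The column names in this schema are not descriptive. They look as if the first data record was read as a header row (for example `a11`, `n_6`, `n_1169`), which would also be consistent with the table holding 999 rows rather than 1,000, though that is an inference and not something the page states. The sketch below maps each column to a readable name by position, assuming the 21 columns follow the order of the 20 attributes listed above plus the outcome class. All descriptive names and the helper `rename_record` are hypothetical.

```python
# Hypothetical descriptive names, matched to the schema by position.
# Assumption: columns follow the attribute list order, with the class last.
GERMAN_COLUMN_NAMES = {
    "a11": "checking_account_status",
    "n_6": "duration_months",
    "a34": "credit_history",
    "a43": "purpose",
    "n_1169": "credit_amount",
    "a65": "savings_account",
    "a75": "present_employment_since",
    "n_4": "installment_rate_pct",
    "a93": "personal_status_and_sex",
    "a101": "other_debtors",
    "n_4_1": "present_residence_since",
    "a121": "property",
    "n_67": "age_years",
    "a143": "other_installment_plans",
    "a152": "housing",
    "n_2": "existing_credits_at_bank",
    "a173": "job",
    "n_1": "num_dependents",
    "a192": "telephone",
    "a201": "foreign_worker",
    "n_1_1": "credit_class",  # 1 = good, 2 = bad
}


def rename_record(record):
    """Return a copy of a row dict keyed by descriptive names.

    Keys that are not in the mapping are passed through unchanged.
    """
    return {GERMAN_COLUMN_NAMES.get(key, key): value
            for key, value in record.items()}


# Illustrative values only, not read from the table.
row = {"a11": "A12", "n_6": 48, "n_1_1": 2}
print(rename_record(row))
```

The same dictionary can be passed to a dataframe library's rename step if the table is loaded that way. Checking a few rows against the attribute descriptions before relying on the mapping is advisable.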
