Bank Marketing Data Set
Predicting the success of a bank's direct marketing campaign (phone calls).
@kaggle.ruthgn_bank_marketing_data_set
This data set contains records relevant to a direct marketing campaign of a Portuguese banking institution. The marketing campaign was executed through phone calls. Often, more than one call needs to be made to a single client before they either decline or agree to a term deposit subscription. The classification goal is to predict if the client will subscribe (yes/no) to the term deposit (variable y).
This is a modified version of the classic bank marketing data set originally shared in the UCI Machine Learning Repository, which offers four versions of the data.
This data set is a copy of the first of them (bank-additional-full.csv) with one input feature (the duration of the phone call) removed. The following note comes from the variable description in the original data set:
duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y='no'). Yet, the duration is not known before a call is performed. Also, after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to have a realistic predictive model.
The duration feature is excluded from this data set to prevent data leakage.
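If you start from the original UCI file instead of this copy, the same exclusion can be reproduced before training. The following is a minimal sketch, not the procedure used to build this data set: the helper name drop_columns is hypothetical, the sample rows are made up for illustration, and the semicolon delimiter is an assumption about the original file layout.

```python
import csv
import io

# Made-up rows in the assumed layout of the original file
# (semicolon-separated); the values are illustrative, not real records.
SAMPLE = (
    "age;job;duration;campaign;y\n"
    "41;technician;180;2;no\n"
    "35;admin.;642;1;yes\n"
)


def drop_columns(rows, leaky=("duration",)):
    """Return copies of the rows without the leakage-prone columns."""
    return [{k: v for k, v in row.items() if k not in leaky} for row in rows]


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=";"))
clean_rows = drop_columns(rows)
print(list(clean_rows[0].keys()))
```

Dropping the column before any train/test split keeps the model from seeing information that is only available once the call has ended.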
Input variables:
bank client data:
1 - age (numeric)
2 - job : type of job (categorical: 'admin.','blue-collar','entrepreneur','housemaid','management','retired','self-employed','services','student','technician','unemployed','unknown')
3 - marital : marital status (categorical: 'divorced','married','single','unknown'; note: 'divorced' means divorced or widowed)
4 - education (categorical: 'basic.4y','basic.6y','basic.9y','high.school','illiterate','professional.course','university.degree','unknown')
5 - default: has credit in default? (categorical: 'no','yes','unknown')
6 - housing: has housing loan? (categorical: 'no','yes','unknown')
7 - loan: has personal loan? (categorical: 'no','yes','unknown')
related to the last contact of the current campaign:
8 - contact: contact communication type (categorical: 'cellular','telephone')
9 - month: last contact month of year (categorical: 'jan', 'feb', 'mar', ..., 'nov', 'dec')
10 - day_of_week: last contact day of the week (categorical: 'mon','tue','wed','thu','fri')
other attributes:
11 - campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
12 - pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)
13 - previous: number of contacts performed before this campaign and for this client (numeric)
14 - poutcome: outcome of the previous marketing campaign (categorical: 'failure','nonexistent','success')
social and economic context attributes:
15 - emp.var.rate: employment variation rate - quarterly indicator (numeric)
16 - cons.price.idx: consumer price index - monthly indicator (numeric)
17 - cons.conf.idx: consumer confidence index - monthly indicator (numeric)
18 - euribor3m: euribor 3 month rate - daily indicator (numeric)
19 - nr.employed: number of employees - quarterly indicator (numeric)
Output variable (desired target):
20 - y: has the client subscribed to a term deposit? (binary: 'yes','no')
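Two of the encodings described above usually need handling before modeling: pdays uses 999 as a sentinel for "not previously contacted", and y is a 'yes'/'no' string. The sketch below shows one possible treatment, assuming rows are read as dictionaries of strings (for example with csv.DictReader). The helper prepare_row and the added previously_contacted field are hypothetical names, not part of the data set.

```python
NOT_CONTACTED = 999  # sentinel value documented for pdays


def prepare_row(row):
    """Return a copy of the row with pdays and y made model-friendly."""
    out = dict(row)
    pdays = int(row["pdays"])
    # Keep the "never contacted" information as an explicit flag so the
    # sentinel is not mistaken for a real count of 999 days.
    out["previously_contacted"] = pdays != NOT_CONTACTED
    out["pdays"] = None if pdays == NOT_CONTACTED else pdays
    # Encode the target as 1 (subscribed) or 0 (did not subscribe).
    out["y"] = 1 if row["y"] == "yes" else 0
    return out


# Made-up example row, for illustration only.
example = {"age": "41", "pdays": "999", "poutcome": "nonexistent", "y": "no"}
prepared = prepare_row(example)
print(prepared["pdays"], prepared["previously_contacted"], prepared["y"])
```

Whether to replace the sentinel with a missing value, a separate flag, or both depends on the model; tree-based models may tolerate the raw 999, while linear models generally will not.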
Source: [Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014
Data credit goes to UCI. Visit their website to access the original data set directly: https://archive.ics.uci.edu/ml/datasets/Bank%2BMarketing
Use this data set to test the performance of your classification models and to explore the best strategies to improve a banking institution's next direct marketing campaign.
Term deposits are cash investments held at a financial institution and are a major source of revenue for banks, which makes them important for financial institutions to market. Telemarketing remains a popular marketing technique because of the potential effectiveness of the human-to-human contact provided by a telephone call, in contrast to the many impersonal and robotic marketing messages relayed through social and digital media. However, executing such a direct marketing effort usually requires a large investment by the business, as large call centers need to be contracted to contact clients directly.
How can the banking institution have more effective direct marketing campaigns in the future? Analyze this data set and identify the patterns that will help us develop future strategies.