Read in the file and get basic information about the data, including numerical summaries.
- Describe the pdays column, make note of the mean, median and minimum values. Anything fishy in the values?
- Describe the pdays column again, this time limiting yourself to the relevant values of pdays. How different are the mean and the median values?
- Plot a horizontal bar graph of the median balance for each education level. Which group has the highest median?
- Make a box plot for pdays. Do you see any outliers?
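
The steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the small DataFrame is a made-up stand-in for the real file, and the assumption that negative pdays values are a placeholder for customers who were never contacted before should be checked against the data description. The helper name plot_summaries is hypothetical.

```python
import pandas as pd

# Toy stand-in for the campaign data; the values are made up for illustration.
df = pd.DataFrame({
    "pdays": [-1, -1, -1, 90, 180, 365, -1, 30],
    "balance": [100, 2500, 300, 1200, 50, 4000, 800, 1500],
    "education": ["primary", "tertiary", "secondary", "tertiary",
                  "primary", "tertiary", "secondary", "secondary"],
})

# Summary over every row, placeholder values included.
all_stats = df["pdays"].describe()

# Assumption: a negative pdays is a placeholder for "never contacted before",
# so only non-negative values are treated as relevant.
contacted = df.loc[df["pdays"] >= 0, "pdays"]
relevant_stats = contacted.describe()

# Median balance per education level, sorted for a horizontal bar chart.
median_balance = df.groupby("education")["balance"].median().sort_values()


def plot_summaries(frame, medians):
    """Draw the horizontal bar chart and the pdays box plot (needs matplotlib)."""
    import matplotlib.pyplot as plt

    fig, (ax_bar, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
    medians.plot(kind="barh", ax=ax_bar, title="Median balance by education")
    frame["pdays"].plot(kind="box", ax=ax_box, title="pdays")
    return fig
```

Comparing all_stats with relevant_stats shows how much the placeholder rows pull the mean and median down, and median_balance.idxmax() names the education group with the highest median. Calling plot_summaries(df, median_balance) draws both plots.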
The final goal is to build a model that predicts whether a customer will respond positively to the campaign. The target variable is “response”. As part of the EDA, perform bi-variate analysis to identify the features that are directly associated with the target variable.
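
One possible way to approach the bi-variate analysis is to compare each feature against the target, one at a time. The sketch below uses made-up rows and assumes that “response” is coded as "yes"/"no"; the helper column response_flag is introduced only for this illustration.

```python
import pandas as pd

# Made-up rows for illustration; the real data comes from the campaign file.
df = pd.DataFrame({
    "education": ["primary", "primary", "secondary",
                  "secondary", "tertiary", "tertiary"],
    "balance": [100, 200, 500, 300, 2000, 1500],
    "response": ["no", "no", "yes", "no", "yes", "yes"],
})

# Assumption: the target is coded as "yes"/"no"; map it to 1/0.
df["response_flag"] = (df["response"] == "yes").astype(int)

# Categorical feature vs. target: share of positive responses per level.
rate_by_education = df.groupby("education")["response_flag"].mean()

# Numerical feature vs. target: median balance within each response group.
median_balance_by_response = df.groupby("response")["balance"].median()
```

Large differences in response rate between levels, or in the median between the two response groups, suggest that a feature is associated with the target and is worth keeping for the model.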
Both models are then compared by accuracy to determine which algorithm performs better.
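
As a rough sketch of that comparison: accuracy is the fraction of predictions that match the true labels, computed for both models on the same held-out rows. The labels, the predictions, and the names model_a and model_b below are placeholders, not results from this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Placeholder held-out labels and predictions (1 = positive response).
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
predictions = {
    "model_a": [1, 0, 0, 0, 0, 1, 0, 1],
    "model_b": [1, 0, 1, 1, 0, 1, 0, 0],
}

# Score both models on the same labels and pick the higher accuracy.
scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}
best_model = max(scores, key=scores.get)
```

If positive responses are rare in the data, accuracy alone can look high for a model that almost always predicts "no", so it may be worth checking the class balance before drawing conclusions.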