Introduction: In our previous blog, we gained deeper insight into our data through exploratory data analysis and, as a supplement, created a Data Visualization project with graphs generated by the Explain feature. In this blog, we will train a model using one of the machine learning algorithms and apply it to the test data to predict which customers are likely to open term deposits.
- Part 1: OAC Machine Learning – Let’s Get Started & Data Preparation
- Part 2: OAC Machine Learning – Gain Deeper Insights Through Visual Exploration
- Part 3: OAC Machine Learning – Knowing Your Model Performance, Simplified
Objective: Create a machine learning model using one of the binary classifiers and apply it to the test data set to predict which new customers are very likely to subscribe to the term deposit. We will work through three stages: modelling, evaluation and prediction.
In this section we cover our final segment, “Part 3: OAC Machine Learning – Knowing Your Model Performance, Simplified”.
We will build our Machine Learning Model through a Data Flow. Let us see how to create a Data Flow in OAC.
Step 1: Create a Data Flow
Create a data flow from one or more data sets. With a data flow, you can produce a curated data set to feed to your algorithm and to use for meaningful visualizations.
- On the Home page, click Create and select Data Flow.
Step 2: Add Labelled Training Data Set
- In the Add Data Set dialog, select a data set that we had uploaded in the first blog and click Add.
Step 3: Select Machine Learning Algorithm
- After selecting the data set, click the ‘+’ symbol to see the various options.
- In the third section of the above image, we can choose one of the following model training options:
- Train Numeric Prediction
- Train Multi-Classifier
- Train Binary Classifier
- Train Clustering
- Our target is the column ‘Outcome of the marketing event’, a binary variable whose value is yes or no (whether or not the customer subscribed to a term deposit), so we will treat this as a binary classification problem.
- Click ‘Train Binary Classifier’ to proceed.
- Next, select one of the binary classification algorithms listed below to predict customers.
- We will use the ‘Random Forest for model training’ classification model.
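To give a feel for what a random forest does behind the training step, here is a minimal Python sketch. It is not OAC's implementation: it is a heavily simplified stand-in that bags one-split trees (decision stumps), each trained on a bootstrap sample with one randomly chosen feature, and lets them vote. The two features, the toy rows and every function name are assumptions made purely for illustration.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label = majority(left)
        right_label = majority(right) if right else left_label
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return (feature,) + best[1:]


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw as many rows as the data set has, with replacement.
        picks = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in picks]
        sample_labels = [labels[i] for i in picks]
        feature = rng.randrange(len(rows[0]))
        forest.append(train_stump(sample_rows, sample_labels, feature))
    return forest


def predict(forest, row):
    """Each stump votes 'yes' or 'no'; the majority wins."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Hypothetical toy data: [call_duration_seconds, previous_contacts]
rows = [[60, 0], [90, 0], [120, 1], [150, 1],
        [400, 3], [450, 4], [500, 5], [600, 6]]
labels = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]

forest = train_forest(rows, labels)
print(predict(forest, [550, 5]))  # a long call with many earlier contacts
print(predict(forest, [50, 0]))   # a short call with no earlier contacts
```

A production random forest grows deeper trees and samples features at every split, but the sketch keeps the two ingredients that give the algorithm its name: randomised training data per tree and a vote across trees.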
Step 4: Designate Target for Prediction
- In this step we need to designate the target column for prediction. Select the Target column ‘Outcome of the marketing event’.
Step 5: Give the Training Model a Name
- When this data flow is executed, it creates a trained model. The next step is to give that model a name (in our case, Random Forest Model).
Step 6: Save and Execute the Data Flow to Create, Train and Test the Model
- Now save the data flow and execute it to train the model.
- The model is created when the data flow executes successfully.
Now we have a trained ML model for predicting which bank customers will subscribe. Next, we will evaluate the quality and key metrics of this model.
Click the Machine Learning tab on the Home page and find the binary classification model (Random Forest Model) that we created.
- Click Inspect to view the Quality details, which include metrics such as model accuracy, precision, recall, F1 value and false positive rate.
- Since it is a classification model, we can also see the confusion matrix.
A confusion matrix is an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This gives us a holistic view of how well our classification model is performing and what kinds of errors it is making.
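For a binary target, the confusion matrix has four cells, and the quality metrics listed on the Inspect page follow from them using the standard textbook definitions. The Python sketch below shows that arithmetic. The labels are made up for illustration and do not come from the bank data set, and the function names are hypothetical; it is not a description of how OAC computes its figures internally.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="yes"):
    """Count true/false positives and negatives for a binary target."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def classification_metrics(tp, fp, fn, tn):
    """Derive the usual quality metrics from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Made-up labels: what actually happened vs. what a model predicted.
actual = ["yes", "no", "no", "yes", "no", "yes", "no", "no"]
predicted = ["yes", "no", "yes", "no", "no", "yes", "no", "no"]

tp, fp, fn, tn = confusion_counts(actual, predicted)
metrics = classification_metrics(tp, fp, fn, tn)

print("          predicted yes  predicted no")
print(f"actual yes {tp:>12} {fn:>13}")
print(f"actual no  {fp:>12} {tn:>13}")
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy alone can hide problems when one class is rare, which is why precision, recall and the false positive rate are worth reading alongside it.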
Here the accuracy of the model is 76%. We could tune the model parameters further or try other algorithms to see whether they achieve better results.
- To apply the model to new customers, upload the data set of new customers and create a new Data Flow as we did before.
- Instead of adding a model training step, select ‘Apply Model’ and choose the model that you created.
- Then select ‘Save Data’ and give the output data set a name.
- Now save and run this data flow. It generates a data set containing the predicted value for each new customer.
Based on the trained model, we can filter on a PredictedValue of ‘yes’ and contact the customers that are very likely to subscribe. This helps financial institutions predict which customers are likely to open term deposits after a marketing campaign, which in turn helps the marketing team contact the right customers.
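If the scored data set is exported for use outside OAC, picking out the customers to contact is a simple filter. The Python sketch below assumes a CSV export; only the PredictedValue column name comes from this walkthrough, while the customer_id and age columns, the sample rows and the function name are hypothetical.

```python
import csv
import io

# Hypothetical export of the scored data set produced by the data flow.
scored_csv = """customer_id,age,PredictedValue
C001,34,yes
C002,51,no
C003,45,yes
C004,29,no
"""


def customers_to_contact(csv_text, prediction_column="PredictedValue",
                         positive="yes"):
    """Return the rows whose predicted value is the positive class."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader
            if row[prediction_column].strip().lower() == positive]


contact_list = customers_to_contact(scored_csv)
contact_ids = [row["customer_id"] for row in contact_list]
print(contact_ids)
```

The same filter can of course be applied inside OAC itself, for example as a filter on the PredictedValue column in a visualization.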
The above can lead to many other great ways to analyze and present data using machine learning and predictive analytics. For more information, go to www.appsassociates.com.
Tejaswini Uppu is one of our Analytics Associates, based in Hyderabad, India, and has been with Apps Associates for more than 2 years, working on multiple analytics technologies spanning Oracle, Informatica, Snowflake and Python.