Lead scoring example

In this example, we will use a Kaggle dataset to create a model that scores the quality of customer leads. Scoring customer leads helps the sales team prioritize those which are most likely to convert into paying customers.

Setup

First, log in to the Continual UI and create a new project. In this example, we'll refer to the project as lead_scoring_example.

Next, download the dataset. We'll be using the Lead Scoring Dataset from Kaggle. Save the CSV to your local machine.

Ensure that you have installed the Continual CLI, are logged in to Continual via the CLI, and have set lead_scoring_example as your default project.

Continual has a built-in seed function for quickly creating tables in your data warehouse straight from CSV files. Execute the following in your terminal:

continual projects seed ./lead_scoring.csv

Note

The seed function should only be used for development purposes and never as a replacement for your production data pipelines.

This uploads the CSV to a table called lead_scoring in the schema your project uses. In our example, the full path is dwh.lead_scoring_example.lead_scoring.

Creating a model

Now it’s time to create a model. Our objective is to produce a score for each lead to help the sales team prioritize their efforts. Our dataset includes a list of historical leads each with a record of whether the lead converted to a paying customer or not.
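Before modeling, it can help to know the baseline conversion rate in the historical data, since that is the figure any lead score has to improve on. The following is a minimal Python sketch and not part of Continual. It assumes a converted column holding 0/1 (or true/false) values, matching the name used in the model definition in this example, and the inline rows are made-up placeholders rather than records from the Kaggle file.

```python
import csv
import io

# Hypothetical sample rows for illustration only. For real data, pass
# open("lead_scoring.csv", newline="") instead of the StringIO below.
SAMPLE_CSV = """prospect_id,converted
a1,1
a2,0
a3,0
a4,1
a5,0
"""


def conversion_rate(csv_file, target="converted"):
    """Return the fraction of rows whose target column is truthy."""
    truthy = {"1", "true", "yes"}
    total = 0
    converted = 0
    for row in csv.DictReader(csv_file):
        total += 1
        if row[target].strip().lower() in truthy:
            converted += 1
    if total == 0:
        raise ValueError("no rows found")
    return converted / total


rate = conversion_rate(io.StringIO(SAMPLE_CSV))
print(f"baseline conversion rate: {rate:.0%}")
```

If the headers in your CSV differ from the names assumed here, pass the actual target column name as the second argument.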

In this case, we bring our features to the model itself rather than linking a feature set to the model. This is known as a standalone model. Typically, we recommend creating feature sets to store your features and then connecting your models to them via entities. That way, as feature sets are updated and modified, the model is automatically refreshed.

There are many configuration settings, which you can read about in the Continual documentation, but for this example we’ll keep it simple.

type: Model
name: lead_converted
description: Online course user interactions
index: prospect_id
target: converted
columns:
  - type: BOOLEAN
    name: converted
query: >
  select * from dwh.lead_scoring_example.lead_scoring

Note

Substitute your database & schema in the query above.
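The model definition above relies on two properties of the data: prospect_id identifies each lead, and converted is a boolean. A quick local check such as the sketch below can surface violations before you push. This is an illustration in plain Python, not a Continual feature. The helper name check_leads and the sample rows are hypothetical, and it assumes that the index should be unique per lead and that the CSV headers match the column names in the definition.

```python
import csv
import io
from collections import Counter

# Values accepted as boolean in this sketch (an assumption, adjust as needed).
BOOLEAN_VALUES = {"0", "1", "true", "false"}


def check_leads(csv_file, index="prospect_id", target="converted"):
    """Return a list of human-readable problems found in the rows."""
    rows = list(csv.DictReader(csv_file))
    problems = []

    # The index column should not repeat.
    counts = Counter(row[index] for row in rows)
    for value, n in counts.items():
        if n > 1:
            problems.append(f"duplicate {index}: {value} ({n} rows)")

    # The target column should only hold boolean-like values.
    for line_no, row in enumerate(rows, start=2):  # the header is line 1
        if row[target].strip().lower() not in BOOLEAN_VALUES:
            problems.append(
                f"line {line_no}: non-boolean {target} {row[target]!r}"
            )
    return problems


# Hypothetical rows: a2 is repeated and one target value is invalid.
sample = io.StringIO("prospect_id,converted\na1,1\na2,0\na2,maybe\n")
for problem in check_leads(sample):
    print(problem)
```

An empty list means both assumptions hold for the file that was checked.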

Once you’ve saved this definition as lead_scoring_model.yaml, ship it off to Continual to build, train, and predict with the following command:

continual push lead_scoring_model.yaml

Viewing the model in the Web UI

The end of the push output gives you a URL to the Continual page that tracks the system change. You can navigate there to watch the progress of the change. It is also an easy jumping-off point for viewing the performance of the model version or the details of the prediction job.

Jump into the model version by clicking the link in the Training step. On the model version page we can look at all sorts of fun stuff such as the experiments that were run, the correlation coefficients between variables, and feature importance.

In the Model Insights tab, we see the tags variable was the most useful to our model training. Tags are status updates assigned to the lead by the sales team. Some variables had little to no impact on the model. Perhaps we can exclude those variables in the future.

This concludes our lead scoring example!