Using Continual with dbt

Overview

Using Continual on a dbt project is easy:

  1. Define Continual feature sets and models in your dbt project

  2. Execute dbt run to run your dbt project and build your data models

  3. Execute continual run to run your Continual project and build your predictive models

For an overview of how this works, refer to dbt integration.

Getting started

In order to use Continual on a dbt project, you'll need to have:

  • Registered an account in Continual
  • Installed the Continual CLI
  • Created a Continual project and connected a data warehouse to it

Refer to the quickstart for a walkthrough of these steps.

Choosing an integration method

If you have an existing dbt project, there are two approaches to adding Continual:

  • Create Continual feature sets and models by annotating dbt files with Continual metadata

  • Create Continual feature sets and models by adding Continual YAML definition files to your repository

Annotating dbt files with Continual metadata

Continual feature sets and models can be defined by annotating dbt files with metadata.

This approach uses dbt's model config mechanisms.

dbt meta fields can be defined in four places:

  1. The models block in the dbt_project.yml project file.
  2. Model schema files (schema.yml) in your models/ directory
  3. Model property files (properties.yml) in your models/ directory
  4. As inline config blocks directly in your dbt model SQL files

In schema or property YAML files, e.g. models/schema.yml, Continual feature sets and models are defined using a YAML syntax:

models:
  - name: customer_churn
    description: Probability a customer churns in the next 30 days.
    meta:
      continual:
        type: Model
        index: ID
        target: churn

In dbt model files, e.g. models/customer_churn.sql, Continual feature sets and models are defined using python/jinja syntax:

{{ 
  config(
    meta = {
      "continual": {
        "type": "Model",
        "index": "ID",
        "target": "churn",
      }
    }
  )
}}

SELECT ...

Annotating dbt projects with Continual YAML

Continual feature sets and models can be defined side-by-side with your dbt models by adding Continual YAML definition files to your repository.

For example, if you have an existing dbt project:

my_dbt_project/
  dbt_project.yml
  models/
    customers.sql
    customer_churn.sql

you can add Continual YAML definition files into your repository by creating a side-by-side folder:

my_dbt_project/
  dbt_project.yml
  models/
    customers.sql
  continual/
    featuresets/
      customers.yml
    models/
      customer_churn.yml

In this setup you run your dbt workflow as usual, and then run the Continual workflow via the CLI as a separate step.

For more information on working with Continual YAML and the CLI, refer to Working with the CLI.

Using dbt metadata

How you define Continual feature sets and models will depend on your chosen integration method.

If using dbt metadata in dbt model files, you'll annotate your dbt model SQL files with python/jinja syntax:

{{
  config(
    materialized = "table",
    meta = {
      "continual": {
        "type": "FeatureSet",
        "entity": "Customer",
        "index": "customer_id"
      }
    }
  )  
}}
SELECT
  customer_id,
  ...

For all other integration methods, you'll define dbt model metadata with YAML syntax.

If using dbt config metadata in the dbt project file dbt_project.yml:

name: my_dbt_project

models:
  my_dbt_project:
    customers: # applies to models/customers.sql
      +meta:
        continual:
          type: FeatureSet
          entity: Customer
          index: customer_id

If using dbt config metadata in dbt model property files models/properties.yml:

version: 2
models:
  - name: customers
    config:
      meta:
        continual:
          type: FeatureSet
          entity: Customer
          index: customer_id

If using dbt metadata in dbt schema files models/schema.yml:

version: 2
models:
  - name: customers
    config:
      meta:
        continual:
          type: FeatureSet
          entity: Customer
          index: customer_id

Defining Continual resources

Defining feature sets

To define a Continual feature set based on an existing dbt model, use type = FeatureSet:

{{
  config(
    materialized = "table",
    meta = {
      "continual": {
        "type": "FeatureSet",
        "index": "customer_id"
      }
    }
  )  
}}
SELECT
  customer_id,
  ...

The only required fields are:

  • type
  • index

Note

Feature set name is also a required field, but when annotating an existing dbt model, Continual will use the name of the model itself.

All fields available in Continual's declarative YAML format can also be used here. For a full list of options, refer to YAML reference.

In addition to these fields, there are also dbt integration specific fields that can be defined:

  • enabled [Optional]: Whether or not to enable Continual integration for this feature set. The default value is True. Any dbt model which has this set to False will be skipped for Continual integration, even if the other parameters are set.

Defining predictive models

To define a Continual predictive model, use type = Model:

{{
  config(
    materialized = "view",
    meta = {
      "continual": {
        "type": "Model",
        "index": "customer_id",
        "target": "is_active"
      }
    }
  )  
}}
SELECT
  customer_id,
  is_active,
  ...

The required fields are:

  • type
  • index
  • target

Note

Predictive model name is also a required field, but when annotating an existing dbt model, Continual will use the name of the model itself.
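The required-field rules for feature sets and predictive models can be sketched as a small check. This is only an illustration of the rules stated on this page, not Continual's own validation code; the function name check_continual_meta and the REQUIRED_FIELDS table are hypothetical.

```python
# Illustrative only: a hypothetical helper, not part of the Continual CLI.
REQUIRED_FIELDS = {
    "FeatureSet": ("type", "index"),
    "Model": ("type", "index", "target"),
}


def check_continual_meta(dbt_model_name, meta):
    """Return (resource_name, missing_fields) for one `continual` meta block."""
    resource_type = meta.get("type")
    if resource_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown or missing Continual type: {resource_type!r}")
    missing = [
        field for field in REQUIRED_FIELDS[resource_type] if field not in meta
    ]
    # When annotating an existing dbt model, the resource takes the model's name.
    return dbt_model_name, missing


name, missing = check_continual_meta(
    "customer_churn", {"type": "Model", "index": "customer_id"}
)
print(name, missing)  # prints: customer_churn ['target']
```

A check like this could be run before continual run to catch a missing target early, though whether and how the CLI reports such errors is not covered here.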

All fields available in Continual's declarative YAML format can also be used here. For a full list of options, refer to YAML reference.

In addition to these fields, there are also dbt integration specific fields that can be defined:

  • enabled [Optional]: Whether or not to enable Continual integration for this predictive model. The default value is True. Any dbt model which has this set to False will be skipped for Continual integration, even if the other parameters are set.

  • create_exposures [Optional]: Whether or not to create dbt exposure files for Continual predictive models. If used, Continual will create exposures in your dbt workflow that will link dependent tables into the Continual model. Default is False.

  • create_sources [Optional]: Whether or not to create dbt source files that link to Continual prediction tables. This will create a source file in your dbt project pointing to all the prediction tables created by Continual, making it easier for downstream consumption. Default is False.

  • create_stub [Optional]: Whether or not to create dbt model stub files for Continual predictive model prediction tables. Using this will have Continual create files in your dbt project that reference the prediction tables built by Continual. This makes it easy to start incorporating your predictions downstream in your dbt project. Default is False.

The following example combines these fields in a single model definition:

{{
  config(
    meta = {
      "continual": {
        "type": "Model",
        "index": "customer_id",
        "time_index": "timestamp",
        "target": "churn",
        "split": "split",
        "description": "description",
        "columns": [
          {"name": "customer_id", "entity": "customers"},
          {"name": "product_id", "entity": "products"},
          {"name": "created_at", "type": "Timestamp"},
        ],
        "exclude_columns": ["address", "SSN"],
        "train": {"metric": "roc_auc", "schedule": "@daily"},
        "promote": {"policy": "best"},
        "predict": {"incremental": True, "schedule": "@daily"},
        "create_stub": True,
        "create_exposures": True,
        "create_sources": True,
        "enabled": True,
      }
    }
  )
}}

SELECT ...

Running

Running Continual with dbt is a two-step process:

  • Execute dbt run
  • Execute continual run

dbt run

Execute dbt run as you would in your normal dbt workflow.

dbt run needs to be done first because:

  • dbt compiles model metadata into a manifest which Continual then uses to determine which Continual resources to create and manage

  • dbt executes on your data warehouse, creating any model tables that Continual will query as inputs to feature sets and predictive models

continual run

continual run:

  • Parses the dbt compiled manifest for Continual metadata
  • Generates Continual YAML representations of any defined feature sets and models
  • Pushes those feature set and model definitions to Continual, which results in the execution of a change plan in Continual
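The first of these steps can be sketched in a few lines of Python. This is a rough illustration of the idea, not the Continual CLI's implementation: it assumes the metadata appears under config.meta in each model node of the parsed manifest, and the function name continual_resources and the sample manifest are hypothetical.

```python
# Illustrative only: a hypothetical sketch of scanning a dbt manifest.
def continual_resources(manifest):
    """Collect `continual` meta blocks from a parsed dbt manifest."""
    found = {}
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue
        # Assumption: meta set via config() or YAML ends up under config.meta.
        meta = (node.get("config", {}).get("meta") or {}).get("continual")
        if not meta:
            continue
        # Models with enabled set to False are skipped (the default is True).
        if meta.get("enabled", True) is False:
            continue
        found[node["name"]] = meta
    return found


# A trimmed-down manifest; a real manifest has many more keys per node.
example_manifest = {
    "nodes": {
        "model.my_dbt_project.customers": {
            "resource_type": "model",
            "name": "customers",
            "config": {
                "meta": {"continual": {"type": "FeatureSet", "index": "customer_id"}}
            },
        },
        "model.my_dbt_project.orders": {
            "resource_type": "model",
            "name": "orders",
            "config": {"meta": {}},
        },
        "model.my_dbt_project.customer_churn": {
            "resource_type": "model",
            "name": "customer_churn",
            "config": {
                "meta": {
                    "continual": {
                        "type": "Model",
                        "index": "customer_id",
                        "target": "churn",
                        "enabled": False,
                    }
                }
            },
        },
    }
}

print(sorted(continual_resources(example_manifest)))  # prints: ['customers']
```

In a real project the manifest would be loaded from the JSON file that dbt writes (target/manifest.json with default settings) rather than built by hand.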

continual run works much like dbt run. It reads profile and target information directly from the dbt project and accepts the following dbt-style flags:

  • --profiles-dir: The directory containing your profiles.yml file.

  • --project-dir: The dbt project directory, i.e. the directory containing your dbt_project.yml file.

  • --profile: Overrides the default profile found in dbt_project.yml.

  • --target: Overrides the default target found in profiles.yml.

Continual-specific information can also be passed in to override project defaults:

  • --project: The Continual project to use. Overrides the currently set project.

  • --continual-dir: The subdirectory of your dbt project in which to save the generated Continual YAML files. By default, this is the target-path set in dbt_project.yml. Because that directory is usually excluded from version control, the Continual YAML files are not saved or versioned in git under the default dbt settings. To keep them, explicitly select a different directory.
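For scripted runs, the two commands and their flags can be assembled programmatically. The sketch below is one possible wrapper, assuming both CLIs are installed and on the PATH; the helper names build_commands and run_pipeline are hypothetical, and only flags listed on this page are used.

```python
import subprocess


def build_commands(target=None, project_dir=None, profiles_dir=None,
                   continual_project=None, continual_dir=None):
    """Return the dbt and Continual commands, in order, as argument lists."""
    shared = []
    if project_dir:
        shared += ["--project-dir", project_dir]
    if profiles_dir:
        shared += ["--profiles-dir", profiles_dir]
    if target:
        shared += ["--target", target]

    # These two flags only apply to continual run.
    continual_only = []
    if continual_project:
        continual_only += ["--project", continual_project]
    if continual_dir:
        continual_only += ["--continual-dir", continual_dir]

    return [
        ["dbt", "run", *shared],
        ["continual", "run", *shared, *continual_only],
    ]


def run_pipeline(**options):
    """Run dbt first, then Continual, stopping at the first failure."""
    for command in build_commands(**options):
        # check=True raises if a command exits non-zero, so continual run
        # is never executed after a failed dbt run.
        subprocess.run(command, check=True)


for command in build_commands(target="dev", continual_dir="continual"):
    print(" ".join(command))
```

Passing the same target to both commands keeps the dbt run and the Continual run pointed at the same profile and warehouse configuration.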

Making changes

Continual configuration changes

When you make changes to the Continual meta fields in your dbt models, you'll need to tell Continual about them:

  • Execute dbt run to regenerate the dbt manifest.json file. If you don't want to issue a run on your data warehouse, you can execute dbt compile instead.
  • Execute continual run. The Continual CLI will parse the dbt compiled manifest, generate updated Continual YAML definitions, and push them to Continual. This results in a change plan being executed.

dbt model changes

When you make schema changes to dbt models which Continual relies on, you'll want to re-run your Continual feature sets and models on the new tables:

  • Execute dbt run to run the dbt models in your data warehouse
  • Execute continual run

Working with projects

dbt projects

A dbt project is a collection of dbt files associated together with a dbt_project.yml configuration file. You typically have a separate git repository for each dbt project.

A dbt project is identified by the name key in dbt_project.yml.

Continual projects

A Continual project is a collection of Continual feature sets and models. We recommend creating a separate Continual project for each dbt project you have.

A Continual project is identified by the name key you use when you log in to the Continual CLI:

continual login --email my@email.com --password secret --project my_project

If your Continual user is associated with more than one Continual project, you can specify the desired project with the --project flag, as in the command above.

For more information on creating Continual projects, refer to Projects and environments.

Working with targets and environments

dbt profiles and targets

In a typical dbt setup, you'll have:

  • One dbt project, defined in dbt_project.yml
  • The project is associated with a profile name via the profile key in dbt_project.yml
  • The profile is defined in ~/.dbt/profiles.yml
  • Note: when using dbt Cloud IDE, you do not configure profiles, but you can configure targets.
  • The profile definition has one or more targets
  • Each target uses a different data warehouse connection configuration
  • In some cases, you may have separate targets pointing to different schemas in the same database
  • In other cases, you may have separate targets pointing to the same schema names but in different databases

Continual environments

In a typical Continual setup, you'll have:

  • One or more Continual projects
  • One or more environments configured in each Continual project
  • A data store configuration associated with each environment

Mapping targets to environments

For a Continual project used in conjunction with a dbt project, the recommended setup is:

  1. Each dbt project is associated with one Continual project
  2. Each dbt target is associated with its own Continual environment

For 2), when executing continual run, the Continual CLI determines the active dbt target in one of two ways:

  • by default, reading the default dbt target from the dbt configuration in dbt_project.yml and ~/.dbt/profiles.yml
  • if the flag is given, using the target passed with continual run --target dbt_target_name (this matches the behaviour of dbt run: dbt run --target dbt_target_name)

It will then use the Continual environment with the same name and configuration (and will create the Continual environment with the name and configuration if it does not already exist).

The implication of this setup is that Continual will read and write to the same database and schema as dbt.
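The target lookup described above can be sketched as follows. This is an illustration of the rule, not the CLI's actual code: it assumes dbt_project.yml and profiles.yml have already been parsed into dictionaries, and the function name resolve_target and the sample schema names are hypothetical.

```python
# Illustrative only: a hypothetical sketch of resolving the active dbt target.
def resolve_target(dbt_project, profiles, cli_target=None):
    """Return the dbt target name; the Continual environment shares it."""
    # dbt_project.yml names the profile; profiles.yml defines its targets.
    profile = profiles[dbt_project["profile"]]
    # A --target value, when given, overrides the profile's default target.
    target = cli_target or profile["target"]
    if target not in profile["outputs"]:
        raise KeyError(f"target {target!r} is not defined in the profile")
    return target


dbt_project = {"name": "my_dbt_project", "profile": "my_profile"}
profiles = {
    "my_profile": {
        "target": "dev",
        "outputs": {
            "dev": {"schema": "analytics_dev"},
            "prod": {"schema": "analytics"},
        },
    }
}

print(resolve_target(dbt_project, profiles))                     # prints: dev
print(resolve_target(dbt_project, profiles, cli_target="prod"))  # prints: prod
```

Under this scheme, continual run --target prod would use a Continual environment named prod, matching the dbt target of the same name.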

Some users may wish to keep their Continual tables in a separate schema from their dbt tables. In this case, it is recommended to create dbt targets specifically for Continual and then execute the following:

dbt run --target dev
continual run --target continual-dev

Environment workflows

Below are some sample workflows that may be common to dbt users and how continual run would be integrated into them.

Development

git clone <my-dbt-project>
cd <my-dbt-project>
git checkout -b <my-new-branch>
<modify dbt files>
dbt run --target dev
continual run --target dev
git add <modified dbt files>
git commit -a
git push

Separate production environment, no CI/CD

git clone <my-dbt-project>
cd <my-dbt-project>
git checkout -b <my-new-branch>
<modify dbt files>
dbt run --target dev
continual run --target dev
git add <modified dbt files>
git commit -a
git push
dbt run --target prod
continual run --target prod

Separate production environment, with CI/CD

git clone <my-dbt-project>
cd <my-dbt-project>
git checkout -b <my-new-branch>
<modify dbt files>
dbt run --target dev
continual run --target dev
git add <modified dbt files>
git commit -a
git push
<create pull request>

In this example, your CI/CD system will be tasked with building models and predictions in production. See our CI/CD guide for more information.
