dbt integration
Design Partners Wanted
Continual's dbt and dbt Cloud integration is undergoing active development and evolution. To be part of our design partner program, learn more about our dbt roadmap, or give general feedback, please email support@continual.ai.
Overview
Many modern data teams use dbt to model the data of their business. They may have one or more dbt models to represent entities in their business, and each of those models contains columns (or attributes, or features) of these entities.
Continual integrates with dbt by allowing dbt users to define entities, feature sets, and predictive models directly from their existing dbt models. This is done using dbt's built-in meta config mechanism.
This integration is intended to give dbt users a lightweight way to start leveraging their dbt assets in Continual without having to redo data transformation work that is already done in dbt. It also provides a workflow that is compatible with the existing dbt workflow: users can execute continual run on a dbt project to have Continual process it and build out the required resources.
Define
Feature sets
Let's say you have a customers model, taken from the dbt Getting Started guide:
{{
    config(
        materialized = "table"
    )
}}
...
final as (
    select
        customers.customer_id,
        customers.first_name,
        customers.last_name,
        customer_orders.first_order_date,
        customer_orders.most_recent_order_date,
        coalesce(customer_orders.number_of_orders, 0) as number_of_orders
    from customers
    left join customer_orders using (customer_id)
)
select * from final
This particular model has:
- An entity id: customer_id
- Two attributes related to the customer's name: first_name, last_name
- Three attributes related to the customer's order history: first_order_date, most_recent_order_date, number_of_orders
We can tell Continual about this entity and its features by adding metadata to the dbt model config:
{{
    config(
        materialized = "table",
        meta = {
            "continual": {
                "type": "FeatureSet",
                "entity": "Customer",
                "index": "customer_id"
            }
        }
    )
}}
...
This defines a feature set in Continual:
- Name: customers (derived from the dbt model name)
- Entity: Customer (defined in the metadata)
- Index (identifier): customer_id (defined in the metadata)
- Features (inferred based on the columns in the dbt model):
  - first_name
  - last_name
  - first_order_date
  - most_recent_order_date
  - number_of_orders
In your dbt project, you will likely have many dbt models containing different attributes for a customer. By defining a Continual feature set for each of these dbt models and associating them with the same Customer entity, you allow Continual to use features from all of the dbt models as inputs to predictive models.
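As an illustration of attaching a second feature set to the same entity, here is a minimal sketch of another dbt model. The model name customer_payments, the upstream payments model, and its amount column are hypothetical and not part of the Getting Started project; only the meta keys (type, entity, index) are taken from the example above.

```sql
-- models/customer_payments.sql
-- Hypothetical model for illustration: assumes an upstream "payments"
-- model exposing customer_id and amount columns.
{{
    config(
        materialized = "table",
        meta = {
            "continual": {
                "type": "FeatureSet",
                "entity": "Customer",
                "index": "customer_id"
            }
        }
    )
}}

select
    customer_id,
    count(*) as number_of_payments,
    sum(amount) as total_amount_paid
from {{ ref("payments") }}
group by customer_id
```

Because both this model and customers declare the Customer entity with customer_id as the index, their columns would be expected to be available together as features for that entity.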
For more information on feature sets, refer to Feature sets and entities.
Predictive models
Predictive models can be defined in the same way as feature sets, using dbt metadata.
When defining a predictive model in an existing dbt project, we recommend creating a new dbt model to represent each prediction target.
Continuing our customers example from above, let's say that the model has an is_active field that we'd like to use as the target for predicting customer active status. We'll create a new dbt model, customer_active_status.sql:
{{
    config(
        materialized = "view",
        meta = {
            "continual": {
                "type": "Model",
                "description": "Predict customer active status",
                "index": "customer_id",
                "target": "is_active",
                "columns": [
                    {"name": "customer_id", "entity": "Customer"}
                ]
            }
        }
    )
}}
SELECT
    customer_id,
    is_active
FROM
    {{ ref("customers") }}
This defines a predictive model in Continual.
For more information on how to configure predictive models, refer to Models and model versions.
Run
Once we've defined Continual feature sets and predictive models in our dbt models, we can execute Continual directly on top of the dbt project:
dbt run
continual run
continual run will:
- Parse compiled dbt models and extract Continual definitions
- Transparently create Continual YAML representations of these definitions
- Push these definitions to Continual
For a complete reference of command line options, refer to Working with dbt.
Workflows
Continual fits into your existing dbt workflow:
- continual run can use your existing dbt profiles and targets to connect to your data warehouse
- You can create separate Continual projects for each of your dbt projects
- You can create separate Continual environments to match your separate dbt profiles and targets as needed
- continual run can be used in the same way as dbt run in your git/GitOps, CI, and CD workflows
Examples
For additional examples, refer to Guided examples.
For full details on integrating Continual with dbt, refer to Working with dbt.