hendrik
Databricks Employee
Databricks Employee

An approach that works well when using a Databricks SQL Warehouse is to use the replace_where strategy - I've just tested this. It also works with partitioned tables:

{{ config(
   materialized='incremental',
   incremental_strategy='replace_where',
   incremental_predicates=["date in (" ~ var('date', '1999-12-31') ~ ")"],
   partition_by=['date']
) }}

select * from {{ ref('table_upstream') }}
where date in ({{ var("date", "1999-12-31") }})

which is compiled into the following SQL

insert into `table_downstream`
  replace where
    date in ('2023-01-03', '2023-01-02')
  table `table_downstream__dbt_tmp`

and run it with the following filter predicates dynamically passed in via CLI:

dbt run --vars "{\"date\": \"'2023-01-03', '2023-01-02'\"}"

I'd recommend to avoid using insert_overwrite with DBSQL warehouses. Also, omitting partition_by=['date'] causes the entire table to be overwritten. While recoverable via time travel and RESTORE, it’s risky if not intended.