mark_ott
Databricks Employee

The recommended way to safely update an online Databricks Feature Store without breaking the serving endpoint or causing downtime involves a version-controlled, atomic update pattern that preserves schema consistency and endpoint stability.

Key Issue

When an online feature table is deleted and recreated due to a schema change, the associated endpoint and feature spec lose binding references, rendering the endpoint unstable. Databricks currently does not support true in-place schema replacement for synced online tables: any schema change to the offline Delta source must be propagated through a publish or merge update, not by recreating the table.

Recommended Approach

1. Use Incremental Schema Evolution

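Delta tables in Databricks support schema evolution, allowing columns to be added or updated without deleting the table. You can use:

python
from databricks.feature_engineering import FeatureEngineeringClient

fe = FeatureEngineeringClient()

# new_feature_df is assumed to be a Spark DataFrame that already holds
# the primary keys plus the new or updated feature columns.
fe.write_table(
    name="catalog.schema.feature_table",
    df=new_feature_df,
    mode="merge",  # merges updates safely
)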

This approach updates the schema and data without breaking existing bindings between the offline and online tables.
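Before merging, it can help to confirm that the change really is additive. The sketch below is a hypothetical pre-flight check, not part of any Databricks API: it compares two plain `{column: type}` dictionaries and flags dropped or retyped columns. Treating only additive changes as safe is my own conservative assumption, and the column names are made up for illustration.

```python
from typing import Dict, List


def schema_diff(current: Dict[str, str], proposed: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify column-level differences between two {column: type} schemas."""
    added = sorted(c for c in proposed if c not in current)
    dropped = sorted(c for c in current if c not in proposed)
    retyped = sorted(
        c for c in current if c in proposed and current[c] != proposed[c]
    )
    return {"added": added, "dropped": dropped, "retyped": retyped}


def is_additive(current: Dict[str, str], proposed: Dict[str, str]) -> bool:
    """True when the proposed schema only adds columns (assumed safe to merge)."""
    diff = schema_diff(current, proposed)
    return not diff["dropped"] and not diff["retyped"]


# Hypothetical schemas, for illustration only.
current_schema = {"customer_id": "bigint", "total_spend": "double"}
proposed_schema = {**current_schema, "visits_30d": "int"}

print(schema_diff(current_schema, proposed_schema))
print(is_additive(current_schema, proposed_schema))
```

With Spark DataFrames, the two dictionaries could be built with `dict(df.dtypes)`.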

2. Republish or Refresh Features Atomically

Instead of deleting the online table, use:

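python
# `fe` is a FeatureEngineeringClient; `online_store` is assumed to be the
# online store object obtained earlier in the notebook.
fe.publish_table(
    source_table_name="catalog.schema.feature_table",
    online_table_name="catalog.schema.online_table",
    online_store=online_store,
    mode="merge",
)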

mode="merge" ensures the online table schema and data are updated incrementally while keeping its identity (and thus the endpoint bindings) intact. This prevents downtime and maintains endpoint stability.

3. Use Lakeflow Jobs for Continuous Sync

If schema changes or feature updates are frequent, schedule Lakeflow Jobs to regularly call publish_table. This approach makes the feature update process continuous and fault-tolerant without manual deletion or recreation.
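As a rough sketch of what such a scheduled job could look like, the snippet below builds a job definition in the shape I understand the Jobs API (2.1) create request to take. The job name, notebook path, and cron expression are hypothetical, compute settings are omitted, and the notebook is assumed to contain the publish_table call.

```python
import json


def build_publish_job(job_name: str, notebook_path: str, cron: str, timezone: str = "UTC") -> dict:
    """Build a job definition that runs a publish notebook on a schedule."""
    return {
        "name": job_name,
        "schedule": {
            "quartz_cron_expression": cron,  # Quartz syntax: sec min hour day month weekday
            "timezone_id": timezone,
            "pause_status": "UNPAUSED",
        },
        # One run at a time, so two publishes never overlap on the same table.
        "max_concurrent_runs": 1,
        "tasks": [
            {
                "task_key": "publish_features",
                "notebook_task": {"notebook_path": notebook_path},
            }
        ],
    }


# Hypothetical names; the cron expression fires at the top of every hour.
payload = build_publish_job(
    job_name="publish-feature-table",
    notebook_path="/Workspace/features/publish_online",
    cron="0 0 * * * ?",
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be submitted through whichever client you already use for job management; check the field names against the current Jobs API reference first.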

4. Maintain Versioned Feature Specs

Databricks recommends maintaining versioned feature specifications (for example, feature_spec_v1, feature_spec_v2), while keeping a constant endpoint mapping. During deployment, update the endpoint’s configuration reference to the new spec version atomically. The endpoint name and URL remain unchanged.
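To illustrate the versioning idea, here is a minimal sketch that derives the next spec name and builds the endpoint configuration that would point at it. Both helper functions and the catalog and schema names are hypothetical, and the `served_entities` layout reflects my understanding of the serving endpoint config format rather than a verified contract.

```python
import re


def next_spec_name(current: str) -> str:
    """Return the next versioned name, e.g. '..._v1' -> '..._v2'."""
    match = re.fullmatch(r"(.+_v)(\d+)", current)
    if match is None:
        raise ValueError(f"expected a name ending in _v<N>, got {current!r}")
    prefix, version = match.groups()
    return f"{prefix}{int(version) + 1}"


def endpoint_config_for(spec_name: str) -> dict:
    """Build a config body that serves the given feature spec."""
    return {
        "served_entities": [
            {
                "entity_name": spec_name,
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            }
        ]
    }


# Hypothetical three-level name; the endpoint itself keeps its name and URL.
new_spec = next_spec_name("main.features.feature_spec_v1")
config = endpoint_config_for(new_spec)
print(new_spec)
print(config)
```

Only the `entity_name` changes between deployments, which is what lets clients keep calling the same URL.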

Practical Schema Evolution Workflow

  1. Update the offline Delta table schema (enable Change Data Feed, CDF, if it is not already set).

  2. Write or merge new features using schema evolution.

  3. Republish the updated offline table to the online store using mode="merge".

  4. Update the feature spec version — do not delete the online table.

  5. Redeploy endpoint referencing the new feature spec (same URL).

Summary Table

| Problem | Corrective Practice |
| --- | --- |
| Schema change causes endpoint breakage | Use Delta schema evolution with mode="merge" |
| Need uninterrupted endpoint (stable URL) | Reuse endpoint, only version feature spec |
| Frequent schema changes | Use Lakeflow Jobs for automated sync |
| Avoid dual tables for one offline source | Use incremental publish_table to preserve online identity |

This workflow is designed to give atomic updates, zero downtime, and endpoint continuity while still allowing schema changes in the Databricks Online Feature Store, which is built on the Lakebase architecture.