Safe Update Strategy for Online Feature Store Without Endpoint Disruption

VivekWV
New Contributor II

Hi Team,

We are implementing the Databricks Online Feature Store on the Lakebase architecture and have run into some constraints during development:

Requirements:

  1. Deploy an offline table as a synced online table and create a feature spec that queries from this online table.
  2. During development, schema changes occur frequently (columns renamed or removed).
  3. After schema changes, we need to redeploy the endpoint with the updated online table and feature spec.
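To make requirement 2 concrete, here is a minimal sketch of how we tell a breaking schema change from a harmless one before redeploying. It is plain Python and does not call any Databricks API. `SchemaDiff`, `diff_schemas`, and the column names are hypothetical, made up for illustration. The assumption is that any column that disappears from the offline table (including a rename, which looks like one removal plus one addition) is treated as breaking for a feature spec that still references the old name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaDiff:
    """Column-level difference between two versions of a table schema."""
    added: frozenset
    removed: frozenset

    @property
    def is_breaking(self) -> bool:
        # Assumption: a removed column (or the old name of a renamed one)
        # is treated as breaking; purely additive changes are not.
        return bool(self.removed)


def diff_schemas(old_columns, new_columns) -> SchemaDiff:
    """Compare two column-name lists and report what was added or removed."""
    old, new = set(old_columns), set(new_columns)
    return SchemaDiff(added=frozenset(new - old), removed=frozenset(old - new))


# Hypothetical example: "age" was renamed to "age_years".
diff = diff_schemas(
    ["user_id", "age", "country"],
    ["user_id", "age_years", "country"],
)
print(sorted(diff.removed), sorted(diff.added), diff.is_breaking)
```

In our case almost every development change falls into the breaking category, which is why we end up deleting and recreating the online table and feature spec.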

Problem: If we delete and recreate the online table and feature spec (to reflect schema changes) while an endpoint is running, the endpoint breaks. In some cases, it cannot be recovered at all.

Constraints:

  • We cannot create two online tables for the same offline table.
  • Deleting and recreating the bound resources (online table + feature spec) disrupts the endpoint.
  • We need to keep a stable endpoint URL for consumers, so we cannot create multiple shadow endpoints.

Question: What is the recommended approach to safely update the online store and feature spec without causing downtime or breaking the endpoint? Is there a supported pattern for atomic updates or versioning in Databricks Feature Store?

Thanks for your guidance!
#lakehouse #databricksonlinefeaturestore #syncedtable #postgres #onlinefeaturestore