Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Build a streaming table on top of a Snowflake table

abelian-grape
New Contributor III

Is it possible to create a streaming table on top of a Snowflake table accessible via Lakehouse Federation?

3 REPLIES

Brahmareddy
Honored Contributor III

Hi abelian-grape,

How are you doing today? As per my understanding, it's not currently possible to create a streaming table directly on top of a Snowflake table accessed through Lakehouse Federation in Databricks. Lakehouse Federation lets you run read-only queries on external databases like Snowflake through Unity Catalog, but it doesn't support streaming or write operations.

So if you need streaming, the best approach is to first ingest the data from Snowflake into a Delta table, and then build your streaming pipeline on top of that Delta table. You could use scheduled batch jobs, or tools like Fivetran, dbt, or custom JDBC logic, to pull data regularly from Snowflake into your Bronze layer, and build your streaming logic from there. Let me know if you want help setting that up!
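To make that concrete, here is a rough PySpark sketch of the pattern. All names are placeholders, not from this thread: `snowflake_cat.sales.orders` stands in for the federated Snowflake table and `bronze.orders` for the Delta copy, and it assumes a Databricks notebook where `spark` is predefined.

```python
# 1) Batch-copy the federated table into a Delta table. Lakehouse
#    Federation is read-only, so this runs as a scheduled batch job.
(spark.read
    .table("snowflake_cat.sales.orders")   # placeholder federated table
    .write.format("delta")
    .mode("overwrite")
    .saveAsTable("bronze.orders"))          # placeholder Delta target

# 2) Delta tables are a supported streaming source, so you can
#    now stream from the copy.
stream_df = spark.readStream.table("bronze.orders")
```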

Regards,

Brahma

abelian-grape
New Contributor III

Hey @Brahmareddy, I ended up creating a Delta table as a mirror of the source Snowflake table (accessed via Lakehouse Federation). I set up logic to append only new records to the Delta table based on a timestamp column, so only records where the timestamp is greater than the current max get added.
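Roughly like this, as a sketch with placeholder names (`mirror.orders` for the Delta mirror, `snowflake_cat.sales.orders` for the federated source, `updated_at` for the timestamp column):

```python
from pyspark.sql import functions as F

# Current high-water mark in the mirror. (On the very first run,
# seed the mirror with a full copy so this is not None.)
max_ts = (spark.table("mirror.orders")
          .agg(F.max("updated_at"))
          .collect()[0][0])

# Append only rows newer than the high-water mark.
(spark.table("snowflake_cat.sales.orders")
    .filter(F.col("updated_at") > F.lit(max_ts))
    .write.format("delta")
    .mode("append")
    .saveAsTable("mirror.orders"))
```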

Then I use readStream in append mode to write those new records to a staging Delta table. The downstream process picks up from this staging table—so for example, it processes new items like 3, 4, 5—and then I delete the processed records from the staging table to ensure only new data gets handled incrementally.
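That hop looks roughly like this (again, table names, the checkpoint path, and the delete predicate are placeholders; the checkpoint is what lets the stream remember which mirror rows it has already shipped):

```python
# Stream new mirror rows into the staging table in append mode.
(spark.readStream
    .table("mirror.orders")
    .writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/tmp/checkpoints/orders_staging")
    .toTable("staging.orders"))

# Downstream, after processing a batch (e.g. items 3, 4, 5),
# the processed records are deleted from staging:
spark.sql("DELETE FROM staging.orders WHERE id IN (3, 4, 5)")
```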

What do you think of this approach? Am I overcomplicating it?

Brahmareddy
Honored Contributor III

Hi abelian-grape,

Honestly, that’s a really solid and practical approach—you’re not overcomplicating it at all. Since Lakehouse Federation doesn't support streaming or change tracking directly, creating a mirror Delta table that pulls new records based on a timestamp is a great workaround. Using readStream in append mode on that Delta table gives you streaming-like behavior, and having a staging table to feed the downstream process and then clean up processed records keeps things tidy and avoids reprocessing. It might feel a bit manual, but this pattern gives you control, reliability, and flexibility, especially when true streaming isn’t available. If it works well for your volume and latency needs, I’d say you’re on the right track! Let me know if you’d like ideas on automating the cleanup or tracking offsets in a more reusable way. 
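One such idea, sketched with placeholder names: let the stream's own checkpoint do the offset tracking and run the downstream step inside foreachBatch, which removes the need for the manual staging delete entirely.

```python
def handle_new_records(batch_df, batch_id):
    # batch_df holds only rows this stream has not seen before,
    # so there is nothing to clean up afterwards.
    batch_df.write.format("delta").mode("append").saveAsTable("silver.orders")

(spark.readStream
    .table("mirror.orders")
    .writeStream
    .foreachBatch(handle_new_records)
    .option("checkpointLocation", "/tmp/checkpoints/orders_downstream")
    .start())
```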

Regards,

Brahma
