Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

What are the differences between a materialized view and a streaming table in Delta Live Tables?

Mike_016978
New Contributor II

Hi,

I was wondering what the differences are between a materialized view and a streaming table. Which one should I use when I move data from a bronze table to a silver table? I found that both CREATE LIVE TABLE and CREATE STREAMING LIVE TABLE can achieve the goal and keep the data up to date when I set the pipeline mode to continuous.

For my project:

  1. I first created a streaming live table to ingest events into the bronze tables.
  2. I want to create some silver-layer tables that are derived from the bronze tables and apply some transformations (see the sketch after my questions).

My questions are:

  1. Since Delta Live Tables materialized views support incremental maintenance and updates, is it better to always use CREATE LIVE TABLE in the silver layer?
  2. Are there any differences between a streaming table and a materialized view?
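
For reference, roughly what I mean in DLT SQL (the path, table names, and columns below are just placeholders):

  -- bronze: streaming table ingesting raw events with Auto Loader
  CREATE OR REFRESH STREAMING LIVE TABLE events_bronze
  AS SELECT * FROM cloud_files("/mnt/raw/events", "json");

  -- silver, option A: CREATE LIVE TABLE (materialized view)
  CREATE OR REFRESH LIVE TABLE events_silver
  AS SELECT * FROM LIVE.events_bronze WHERE event_id IS NOT NULL;

  -- silver, option B: CREATE STREAMING LIVE TABLE (streaming table)
  CREATE OR REFRESH STREAMING LIVE TABLE events_silver
  AS SELECT * FROM STREAM(LIVE.events_bronze) WHERE event_id IS NOT NULL;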

Thanks a lot.

1 ACCEPTED SOLUTION


Anonymous
Not applicable

@Mike Chen:

Materialized views are precomputed query results that are stored as Delta Lake tables. They can be used to speed up queries that are executed frequently and have a high computational cost. Materialized views are updated automatically when the underlying data changes, and they can also be refreshed manually with the REFRESH MATERIALIZED VIEW command. Materialized views are designed for batch processing workloads and are not suitable for streaming data.
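
As a rough sketch in DLT SQL (table and column names are placeholders, reusing the bronze table from your question), a materialized view is what CREATE LIVE TABLE produces, for example to precompute an expensive aggregation:

  -- materialized view: the aggregation is precomputed and kept up to date by pipeline updates
  CREATE OR REFRESH LIVE TABLE daily_event_counts
  AS SELECT date(event_time) AS event_date, count(*) AS event_count
  FROM LIVE.events_bronze
  GROUP BY date(event_time);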

Streaming tables are Delta Lake tables that are continuously updated with new data from streaming sources such as Apache Kafka or Spark Structured Streaming. Streaming tables use Delta Lake's transactional capabilities to ensure data consistency and reliability.
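
And a rough sketch of a streaming table for the silver layer (again with placeholder names), reading incrementally from the bronze table so that only newly arrived rows are processed on each update:

  -- streaming table: processes only new bronze rows on each pipeline update
  CREATE OR REFRESH STREAMING LIVE TABLE events_silver
  AS SELECT event_id, to_timestamp(event_time) AS event_time, payload
  FROM STREAM(LIVE.events_bronze)
  WHERE event_id IS NOT NULL;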


3 REPLIES


@Suteja Kanuri:

From my understanding, when you create a materialized view, its contents reflect the state of the underlying table or tables at that time. The data in the materialized view remains unchanged even when applications change the data in the underlying tables, and to refresh the view we need to refresh it manually. Is this not the case?
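
To be concrete, by refreshing it manually I mean something like the command mentioned above (the view name here is just a placeholder):

  REFRESH MATERIALIZED VIEW events_silver;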

Anonymous
Not applicable

Hi @Mike Chen

Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.

Please help us select the best solution by clicking on "Select As Best" if it does.

Your feedback will help us ensure that we are providing the best possible service to you.

Thank you!
