Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Z-Ordering a Unity Catalog Materialized View

Jake2
New Contributor III

Hey everyone, 

We're making the move to Unity Catalog from Hive_Metastore and we're running into some issues performing Z-order optimizations on some of our tables. 

In both metastores, these tables are materialized views created with a "create or refresh live table" statement. They have the same schemas and data.

However, when we try to Z-order these Unity Catalog materialized views, we get the following message:

 

AnalysisException: <table_name> is a materialized view. 'OPTIMIZE' expects a table.

 

Given that our Hive_metastore equivalent tables are also materialized views, we're unsure why we only error out on the Unity Catalog ones. Has anyone run into this issue, and if so, how did you get around it?

Thanks in advance!

 

1 ACCEPTED SOLUTION

Accepted Solutions

Jake2
New Contributor III

For anyone who's reading this later: You can still Z-order your materialized views, but you can't run it as a SQL command. Instead, you can set it as one of the TBLPROPERTIES when you define the table. Here's an example:

create or refresh live table {new_table}
tblproperties (
	"quality" = "gold",
	"pipelines.autoOptimize.zOrderCols" = "KEY_1,KEY_2"
)
as
select * from {old_table}

This won't Z-order the data immediately; the Z-ordering happens during the pipeline's maintenance job. That also means your tables won't be Z-ordered until that job has run, and to my knowledge you have little control over when it runs. This can be a problem if you're running the pipelines on a daily basis.
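If the pipeline is defined in Python rather than SQL, the same property can be set through the DLT decorator. A minimal sketch under that assumption; the table and column names here are placeholders, not names from this thread:

```python
import dlt  # available only inside a Delta Live Tables pipeline

# Hypothetical gold table; "old_table" and the KEY_* columns are placeholders.
@dlt.table(
    table_properties={
        "quality": "gold",
        # Columns to Z-order during the pipeline's scheduled maintenance job:
        "pipelines.autoOptimize.zOrderCols": "KEY_1,KEY_2",
    },
)
def new_table():
    # spark is provided by the DLT runtime
    return spark.read.table("old_table")
```

As with the SQL version, the property only takes effect when the maintenance job runs, not at pipeline update time.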


4 REPLIES

shan_chandra
Databricks Employee

Hi @Jake2 - OPTIMIZE cannot be run against materialized views. Please refer to the limitations below.

https://docs.databricks.com/en/sql/user/materialized-views.html#limitations

Thanks,

Shan

Jake2
New Contributor III

Hi shan_chandra, thanks for the response.

We found that piece of documentation, but there are a couple of points we find vexing about it:

  • It mentions that MVs are "Unity Catalog managed tables" - we have been using them on hive_metastore for quite some time now.
  • The limitations do say you can't run ad-hoc OPTIMIZE commands against MVs, but again, we've been doing that for quite some time on our hive_metastore MVs. 
  • To that point, there isn't a clear indicator as to what's ad-hoc and what isn't. If we wanted to OPTIMIZE the MVs on a non-ad-hoc basis, there isn't anything on the page that shows you how.

I feel like the MVs we have in hive_metastore and the MVs we have in Unity Catalog have fundamentally different behavior and limitations, and it's been a challenge to navigate.


shan_chandra
Databricks Employee

Hi @Jake2 - Thanks for your response. Z-ordering as part of the DLT pipeline maintenance job, as you describe, is the way to address this. As for the comparison between UC and Hive metastore materialized views: materialized views outside of UC are just tables that are fully recomputed on each refresh, whereas in UC the underlying architecture supports incremental recomputation.
