Materialized view resources created after creation query failed
03-18-2025 06:16 AM
Hello, I'm testing Materialized Views and I ran the following query:
create materialized view `xx`.`xx`.`stg_sales`
PARTITIONED BY (id, external_id, tenant_id)
TBLPROPERTIES ('cluster_by' = 'id') as
with stg_accounts_events as (
select
*
from
`xx`.`xx`.`snapshot_account_events`
),
cast_columns as (
select
id,
external_id,
tenant_id
from
stg_accounts_events
)
select
*
from
cast_columns
This query has been redacted; in its real context it was a very heavy query. We have a query timeout of 10 minutes, and since this query took longer than that, it timed out.
The only log I had was that the query timed out.
With all that said, a materialized view was created with a name like __staging__acounts_{some hash id}__, which I did not create; I'm assuming it's some 'behind the scenes' resource created by Databricks. A DLT job was also created and incurred costs for some days.
I'm aware that DLT is how MVs get refreshed behind the scenes. The issue here is that my query did not work and my MV was not created, yet the managed Databricks resources were created anyway, and I had no way of knowing this without querying system tables and finding MVs that I did not purposely create.
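For reference, a minimal sketch of the kind of system-table lookup that surfaces these objects (this assumes Unity Catalog's information_schema.tables relation with a table_type column; the __staging__ name pattern is the redacted one from above, so adjust it for your own objects):
-- list materialized views in the metastore, including ones created behind the scenes
-- note: '_' is a single-character wildcard in LIKE, so this name filter is approximate
select table_catalog, table_schema, table_name, table_type
from system.information_schema.tables
where table_type = 'MATERIALIZED_VIEW'
  and table_name like '__staging__%'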
Is this a bug? If not and this is the expected behaviour, is it documented somewhere, making clear that I have to clean up resources created by Databricks myself when my query does not work?
Thank you!
03-18-2025 09:19 AM
Hi Victor,
How are you doing today? As per my understanding, yes, this seems like expected behavior, but it's definitely not well documented. When you create a Materialized View (MV) in Databricks, it automatically sets up a DLT (Delta Live Tables) pipeline behind the scenes to manage and refresh the view. Even if the query times out or fails, Databricks still creates internal resources, like the MV staging metadata (__staging__accounts_{hash}__) and the DLT pipeline, which can continue running and incurring costs unless manually cleaned up. Unfortunately, there isn't a built-in way to easily detect failed MV creations.
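If you want to see what that managed pipeline has been costing, one rough option is the billing system table. A minimal sketch, assuming the system.billing schema is enabled in your workspace and that usage_metadata.dlt_pipeline_id is populated for DLT usage (verify the column names against the system tables documentation for your workspace):
-- approximate daily DBU usage attributed to DLT pipelines
select
  usage_date,
  sku_name,
  usage_metadata.dlt_pipeline_id as pipeline_id,
  sum(usage_quantity) as dbus
from system.billing.usage
where usage_metadata.dlt_pipeline_id is not null
group by usage_date, sku_name, usage_metadata.dlt_pipeline_id
order by usage_date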
To avoid this in the future, you might want to test your query separately before using it in an MV and monitor the system tables (information_schema) for any unexpected MVs, along the lines of the lookup sketched above. If you notice unwanted resources, you'll need to manually drop the MV and stop the DLT pipeline (a cleanup sketch follows below). It would be great if Databricks had a clearer way to notify users when an MV creation fails but still leaves resources behind; it may be worth raising a support ticket to ask for better documentation. Let me know if you need help cleaning up those resources!
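As a concrete cleanup sketch (hypothetical names; check the leftover object's table_type first, since it may be registered as a materialized view or as a plain table, and afterwards confirm in the Pipelines UI that the backing pipeline is gone):
-- drop the leftover staging object once you have confirmed it is safe to remove
drop materialized view if exists `xx`.`xx`.`__staging__accounts_{hash}`;
-- if it turns out to be registered as a regular table instead:
-- drop table if exists `xx`.`xx`.`__staging__accounts_{hash}`;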
Regards,
Brahma
03-19-2025 07:33 AM
Thank you for the very complete answer!
I keep changing my mind between calling this a bug and expected behavior; maybe if the error logs stated that resources were created, I would be convinced that this is how it works.
I will raise a support ticket as you mentioned. Thank you, Brahma!
03-19-2025 08:07 AM
You are always welcome, Vitor. Do let me know if you have any other questions. Good day.

