09-17-2024 08:24 AM
I get the following error message on the attempt to use SQL Warehouse (Serverless) compute with Materialized Views (a simple interaction, e.g. DML, UI sample lookup). The MVs are created off the back of Federated Tables (Postgresql), MVs are created by DLT pipeline.
Your request failed with status FAILED: [BAD_REQUEST] [FAILED_JDBC.UNCLASSIFIED] Failed JDBC jdbc: on the operation: Failed table existence check: [....] SQLSTATE: HV000
Configuration:
The workspace is assigned to Metastore (UC), other objects e.g. Streaming Tables (DLT), stand-alone Delta Tables work with SQL Warehouse (Serverless) compute without problems. The Materialized Views are persisted on the external location (Storage Account) as other objects, so the root storage is the same for catalogs/schemas (it is metastore-level configured) - so network misconfiguration is likely ruled out.
What works?
- any other non-serverless compute does not have problems accessing MVs
Please Advise how to find a solution for this problem.
4 weeks ago
Just a little update on it before I close it. This was likely a bug that has been addressed by Databricks, as things started working as expected without any change, created MV does not have dependency on the federated table.
09-17-2024 10:52 AM
Hi @pesky_chris!
Are you otherwise able to access your federated Postgresql data via the Serverless SQL Warehouse (e.g. using a SELECT statement)?
If not, your issue may be due to networking/firewall between the Serverless SQL Warehouse and your Postgresql server.
Here is some info on configuring private connectivity for serverless: https://learn.microsoft.com/en-us/azure/databricks/security/network/serverless-network-security/serv...
09-18-2024 04:59 AM
Hi @RCo
Thanks for your reply.
SQL Warehouse (Serverless) is set up to access the blob storage (on a storage account) set up with Metastore. Federated Tables come from the database hosted in the private network and are not meant to be accessed by Serverless. They work only with non-serverless compute, it is VNet Injection Databricks deployment.
I would assume that a Materialised View is effectively a persisted table with ability to refresh from the source. Serverless behaviour indicates that access to MV's underlying data source would be required to interact with the MV. This does not make sense, unless I don't understand how the MV does look like in Databricks.
09-19-2024 09:40 AM - edited 09-19-2024 09:54 AM
That is a great point!
I did some testing and I was able to re-create your issue in my own environment by doing the following:
Is this also the process you used to create the materialized view?
The documentation (https://learn.microsoft.com/en-us/azure/databricks/views/materialized) suggests that, instead of "manually" creating the DLT pipeline as outlined above, the "CREATE MATERIALIZED VIEW" statement should be ran on a pro or serverless SQL Warehouse.
I tried doing this, but encountered networking/firewall related errors for both pro and serverless warehouses.
I think this all suggests that trying to "SELECT" from a materialized view using a pro/serverless SQL Warehouse (when the materialized view create statement was not run using a pro/serverless SQL Warehouse) will implicitly try to refresh the materialized view.
09-20-2024 05:47 AM - edited 09-20-2024 05:49 AM
Hey,
To clarify, as I think I'm potentially hitting Databricks unintended "functionality".
What works:
I reckon it is unexpected behaviour. I got admin permissions on the workspace to look up __databricks_internal catalog where delta tables supporting MVs are visible. I traversed DLT's compute spool log to see how they are constructed, the data is streamed into Delta Tables and then reconciled. Furthermore, I can run DML against those technical tables in __databricks_internal (respective schemas) with Serverless without problems.
This is a fragment from the Docs :
> The underlying files supporting materialized views might include data from upstream tables (including possible personally identifiable information) that do not appear in the materialized view definition. This data is automatically added to the underlying storage to support incremental refreshing of materialized views.[...]
4 weeks ago
Just a little update on it before I close it. This was likely a bug that has been addressed by Databricks, as things started working as expected without any change, created MV does not have dependency on the federated table.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group