Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Problem with SQL Warehouse (Serverless)

pesky_chris
New Contributor III

I get the following error message when I try to use SQL Warehouse (Serverless) compute with Materialized Views (even for a simple interaction, e.g. DML or a UI sample lookup). The MVs are built on top of Federated Tables (PostgreSQL) and are created by a DLT pipeline.

Your request failed with status FAILED: [BAD_REQUEST] [FAILED_JDBC.UNCLASSIFIED] Failed JDBC jdbc: on the operation: Failed table existence check: [....] SQLSTATE: HV000

Configuration:

The workspace is assigned to a Metastore (UC); other objects, e.g. Streaming Tables (DLT) and stand-alone Delta Tables, work with SQL Warehouse (Serverless) compute without problems. The Materialized Views are persisted to the same external location (Storage Account) as the other objects, so the root storage is the same across catalogs/schemas (it is configured at the metastore level), which likely rules out a network misconfiguration.

What works?

- any non-serverless compute has no problems accessing the MVs

Please advise how I can find a solution to this problem.

 

1 ACCEPTED SOLUTION

Accepted Solutions

Just a little update before I close this. This was likely a bug that has since been addressed by Databricks, as things started working as expected without any change on my side; the created MV no longer has a dependency on the federated table.


5 REPLIES

RCo
New Contributor III

Hi @pesky_chris!

Are you otherwise able to access your federated Postgresql data via the Serverless SQL Warehouse (e.g. using a SELECT statement)?

If not, your issue may be due to networking/firewall restrictions between the Serverless SQL Warehouse and your PostgreSQL server.

Here is some info on configuring private connectivity for serverless: https://learn.microsoft.com/en-us/azure/databricks/security/network/serverless-network-security/serv...
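A quick way to run the check suggested above is a cheap probe query against the federated catalog. This is only a sketch; the catalog/schema/table names below ("pg_federated", "public", "customers") are hypothetical placeholders, not names from this thread.

```python
# Hedged sketch of a minimal connectivity probe to run from a
# Serverless SQL Warehouse. All object names are hypothetical.

def build_probe_query(catalog: str, schema: str, table: str, limit: int = 1) -> str:
    """Build a cheap SELECT that only checks the federated table is reachable."""
    return f"SELECT 1 FROM {catalog}.{schema}.{table} LIMIT {limit}"

query = build_probe_query("pg_federated", "public", "customers")
print(query)  # SELECT 1 FROM pg_federated.public.customers LIMIT 1
# On Databricks you would execute it with spark.sql(query) or in the SQL editor.
```

If the probe fails with a connection-type error rather than a permissions error, the networking/firewall explanation becomes more likely.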

pesky_chris
New Contributor III

Hi @RCo 

Thanks for your reply.


SQL Warehouse (Serverless) is set up to access the blob storage (on a Storage Account) configured with the Metastore. The Federated Tables come from a database hosted in a private network and are not meant to be accessed by Serverless; they work only with non-serverless compute. It is a VNet Injection Databricks deployment.

I would assume that a Materialised View is effectively a persisted table with the ability to refresh from its source. The Serverless behaviour indicates that access to the MV's underlying data source is required just to interact with the MV. This does not make sense, unless I misunderstand what an MV actually looks like in Databricks.

RCo
New Contributor III

That is a great point!

I did some testing and I was able to re-create your issue in my own environment by doing the following:

  1. Create a notebook with a `CREATE MATERIALIZED VIEW IF NOT EXISTS` statement that queries a Lakehouse Federated database.
  2. Manually create a DLT pipeline that runs this notebook.
  3. Try to query the materialized view using Serverless SQL.

Is this also the process you used to create the materialized view?
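The three repro steps above can be sketched roughly as follows. All object names ("main.reports.orders_mv", "pg_federated.public.orders") are hypothetical placeholders; on Databricks each statement would be executed via spark.sql() or in the SQL editor, which is not done here.

```python
# Hedged sketch of the three repro steps. Object names are hypothetical.

# Step 1: notebook cell creating the MV over a Lakehouse Federated table.
create_mv = (
    "CREATE MATERIALIZED VIEW IF NOT EXISTS main.reports.orders_mv AS "
    "SELECT * FROM pg_federated.public.orders"
)

# Step 2: attach this notebook to a manually created DLT pipeline and run it.

# Step 3: from a Serverless SQL Warehouse, query the resulting MV.
query_mv = "SELECT * FROM main.reports.orders_mv LIMIT 10"
```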

 

The documentation (https://learn.microsoft.com/en-us/azure/databricks/views/materialized) suggests that, instead of "manually" creating the DLT pipeline as outlined above, the "CREATE MATERIALIZED VIEW" statement should be run on a pro or serverless SQL Warehouse.

I tried doing this, but encountered networking/firewall related errors for both pro and serverless warehouses.

 

I think this all suggests that trying to "SELECT" from a materialized view using a pro/serverless SQL Warehouse (when the CREATE statement was not run on a pro/serverless SQL Warehouse) will implicitly try to refresh the materialized view.

pesky_chris
New Contributor III

Hey,

To clarify, as I think I'm potentially hitting unintended Databricks "functionality":

  • Materialised Views are managed by a DLT pipeline, which was deployed with DABs from a CI/CD pipeline,
  • the DLT pipeline runs a notebook with Python code that creates the MVs dynamically,
  • the DLT pipeline is set up with (self-hosted, or non-serverless if you prefer that name) compute,
  • the compute (effectively a VM), as it is spun up in the private network, is set up to have access to the other VNets where the PostgreSQL database is hosted,
  • the SQL Warehouse Serverless network configuration is in place (NCC + Private Links to the Metastore Storage Account).
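For context, the dynamic MV creation in the DLT notebook might look roughly like the sketch below. The table list, source catalog, and target schema are hypothetical placeholders (the thread does not name them); on Databricks each generated statement would be executed with spark.sql().

```python
# Hedged sketch of dynamically generating MV DDL inside the DLT notebook.
# All names below are hypothetical placeholders.

FEDERATED_TABLES = ["customers", "orders", "invoices"]  # hypothetical

def mv_statements(source_catalog: str, source_schema: str,
                  target_schema: str, tables: list[str]) -> list[str]:
    """Generate one CREATE MATERIALIZED VIEW statement per federated table."""
    return [
        f"CREATE MATERIALIZED VIEW IF NOT EXISTS {target_schema}.{t}_mv AS "
        f"SELECT * FROM {source_catalog}.{source_schema}.{t}"
        for t in tables
    ]

for stmt in mv_statements("pg_federated", "public", "main.reports", FEDERATED_TABLES):
    print(stmt)  # on Databricks: spark.sql(stmt)
```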

What works:

  • I can run DML against those MVs with (non-serverless) compute,
  • Serverless (SQL Warehouse Serverless) errors on the MVs but works with everything else.

I reckon it is unexpected behaviour. I have admin permissions on the workspace, so I looked up the __databricks_internal catalog, where the Delta tables backing the MVs are visible. I traversed the DLT compute's spool log to see how they are constructed: the data is streamed into Delta tables and then reconciled. Furthermore, I can run DML against those technical tables in __databricks_internal (in the respective schemas) with Serverless without problems.

This is a fragment from the docs:

> The underlying files supporting materialized views might include data from upstream tables (including possible personally identifiable information) that do not appear in the materialized view definition. This data is automatically added to the underlying storage to support incremental refreshing of materialized views.[...]

 

 

