03-10-2023 09:43 AM
Hi there,
I'm helping a client of mine set up an Azure Databricks environment. The workspace is set up for private access only, and we are using Azure Firewall and Azure Private Link.
We have the network environment successfully configured to the point where we are able to start clusters in the workspace. However, when trying to run a simple SQL statement from within a notebook, I'm getting a very strange error:
CREATE CATALOG IF NOT EXISTS quickstart_catalog
com.databricks.common.client.UnexpectedHttpError: HTTP request failed with status: HTTP/1.1 302 Found
Looking at the Azure FW application rule logs, I see traffic outbound from the cluster to canadacentral.azuredatabricks.net, and it looks like the 302 is coming from a redirect following a failed authentication attempt. I suspect that the cluster is getting an HTML payload in response from the server (the Databricks login page), and the 302 is the redirect that happens when authentication fails (e.g. when trying to connect to a private workspace from outside the network). This is reinforced by what I see when I try to run USE CATALOG - the HTML payload that comes back is the Databricks login page:
USE CATALOG main
Py4JJavaError: An error occurred while calling o336.sql.
: shaded.v245.com.fasterxml.jackson.core.JsonParseException: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: <!doctype html><html><head><meta charset="utf-8"/><meta http-equiv="Content-Language" content="en"/><title>Databricks - Sign In</title><meta name="viewport" content="width=960"/><link rel="icon" type="image/png" href="/favicon.ico"/><meta http-equiv="content-type" content="text/html; charset=UTF8"/><link rel="icon" href="/favicon.ico"><script defer="defer" src="/static/js/login/login.fb760649.js"></script></head><body class="light-mode"><uses-legacy-bootstrap><div id="login-page"></div></uses-legacy-bootstrap></body></html>; line: 1, column: 2]
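As a rough aid for telling these two symptoms apart, here is a minimal Python sketch (not part of the original post) that classifies a response as a redirect, JSON, or an HTML sign-in page. The function name `classify_response` and the "sign in" heuristic are assumptions for illustration, based only on the page title in the trace above; this is not how the Databricks client itself handles responses.

```python
import json


def classify_response(status: int, body: str) -> str:
    """Roughly classify an HTTP response (hypothetical helper, heuristic only)."""
    # Any 3xx status is reported as a redirect, matching the 302 seen above.
    if 300 <= status < 400:
        return "redirect"
    # A healthy API call is expected to return a JSON body.
    try:
        json.loads(body)
        return "json"
    except ValueError:
        pass
    lowered = body.lower()
    # An HTML body where JSON was expected is what triggers the
    # JsonParseException ("Unexpected character '<'") in the trace.
    if "<html" in lowered:
        # Assumption: the sign-in page is recognizable by its title text.
        return "login-page" if "sign in" in lowered else "html"
    return "unknown"
```

If the body captured from a failing call classifies as "login-page", that would be consistent with the request reaching the workspace front end without being accepted as authenticated, rather than with a plain connectivity failure.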
I suspect that something is incorrect with the Azure Firewall/Private Link setup, but I'm not entirely sure what. Quick summary:
I know the endpoints are working to some extent because I am able to both log into the workspace and start clusters (meaning the secure cluster connectivity relay is being established). However, I'm not really sure why attempts to run DB SQL are returning what looks like authentication errors from the DB login page. I suspect I am missing something with the Private Link/AFW setup. Any help would be much appreciated!
EDIT TO ADD:
When I try to use a non-UC catalog (such as the default hive_metastore), I don't get any errors. It's only when trying to run DB SQL against a UC-backed catalog.
03-11-2023 06:23 AM
Are you able to connect to the ADLS container that has been configured as the root storage of the UC metastore? The host in <container-name>@<storage-account-name>.dfs.core.windows.net should resolve to the private IP of the resource.
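One way to check this from a notebook on a running cluster is sketched below, using only the Python standard library. The helper names `is_private_address` and `resolve_addresses` are made up for illustration, and the storage account host in the usage comment is a placeholder, not a value from this thread.

```python
import ipaddress
import socket


def is_private_address(ip: str) -> bool:
    """Return True if the IP falls in a private range (e.g. 10.0.0.0/8)."""
    return ipaddress.ip_address(ip).is_private


def resolve_addresses(host: str, port: int = 443) -> list:
    """Resolve a hostname and return the distinct IP addresses it maps to."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Example usage (placeholder host, replace with the real storage account name):
#   for ip in resolve_addresses("<storage-account-name>.dfs.core.windows.net"):
#       print(ip, "private" if is_private_address(ip) else "PUBLIC")
```

A public address in the output would suggest the private DNS zone for the storage endpoint is not being used by the cluster's network. A private address only confirms name resolution, not that the traffic is actually allowed through.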
03-13-2023 11:58 AM
I just set up a Private Link endpoint for the storage account (there was not one set up previously). However, even though nslookup from a running cluster shows resolution to the private IP of the endpoint, I'm still seeing the UnexpectedHttpError: 302 when trying to do anything with the metastore from Databricks.
03-21-2023 10:59 PM
Hi @Nick Barretta
I'm sorry you could not find a solution to your problem in the answers provided.
Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues.
I suggest providing more information about your problem, such as specific error messages, error logs or details about the steps you have taken. This can help our community members better understand the issue and provide more targeted solutions.
Alternatively, you can consider contacting the support team for your product or service. They may be able to provide additional assistance or escalate the issue to the appropriate section for further investigation.
Thank you for your patience and understanding, and please let us know if there is anything else we can do to assist you.