<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Connecting to Snowflake using an SSO user from Azure Databricks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10602#M5756</link>
    <description>&lt;P&gt;We had confirmation from Databricks that the approach detailed in my reply was the only workable approach for user-based SSO. We ended up replicating data across from Snowflake into Databricks in the end. Another approach is to use a service account etc, however this might break some data governance and audit policies if your org. requires such things.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Something else to consider is that if your query can be pushed down into Snowflake then it gets executed on the Snowflake side and the results returned via the connector. As soon as a user executes a query in Databricks that can't be pushed down (e.g., uses a UDF) then the data for every table referenced in that query is pulled through to Databricks and the query is executed on the Databricks side. This will have a very significant impact on query performance, therefore depending on your use-case, data volume, and frequency of Snowflake being queried you might want to consider replication.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.snowflake.com/en/user-guide/spark-connector-use#pushdown" target="test_blank"&gt;https://docs.snowflake.com/en/user-guide/spark-connector-use#pushdown&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 09 Jun 2023 01:17:21 GMT</pubDate>
    <dc:creator>ajbush</dc:creator>
    <dc:date>2023-06-09T01:17:21Z</dc:date>
    <item>
      <title>Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10599#M5753</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I'm just reaching out to see if anyone has information or can point me in a useful direction. I need to connect to Snowflake from Azure Databricks using the connector: &lt;A href="https://learn.microsoft.com/en-us/azure/databricks/external-data/snowflake" alt="https://learn.microsoft.com/en-us/azure/databricks/external-data/snowflake" target="_blank"&gt;https://learn.microsoft.com/en-us/azure/databricks/external-data/snowflake&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The connector requires that users put their username and password in the connections details (or in a secret store), however these are the credentials for a Snowflake user and the &lt;A href="https://community.snowflake.com/s/article/HOW-TO-Setup-SSO-with-Azure-AD-and-the-Snowflake-New-URL-Format-or-Privatelink" alt="https://community.snowflake.com/s/article/HOW-TO-Setup-SSO-with-Azure-AD-and-the-Snowflake-New-URL-Format-or-Privatelink" target="_blank"&gt;Snowflake account is configured to use SSO with Azure as the IdP&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Looking through the documentation for the Snowflake JDBC connector to get the connector working when using SSO you need to configure: &lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;.option("authenticator" , "externalbrowser")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;You can see that here: &lt;A href="https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#setting-up-browser-based-sso" alt="https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#setting-up-browser-based-sso" target="_blank"&gt;https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#setting-up-browser-based-sso&lt;/A&gt;&lt;/P&gt;&lt;P&gt;And an example of configuring the connector with Spark is here: &lt;A href="https://community.snowflake.com/s/article/How-to-connect-to-Snowflake-with-Spark-connector-via-SSO" alt="https://community.snowflake.com/s/article/How-to-connect-to-Snowflake-with-Spark-connector-via-SSO" target="_blank"&gt;https://community.snowflake.com/s/article/How-to-connect-to-Snowflake-with-Spark-connector-via-SSO&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;When configuring the connector in this way I get indefinite hanging of the Spark job:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;snowflake_table = (spark.read
  .format("snowflake")
  .option("dbtable", "blah")
  .option("sfURL", "xxxxxxxx.snowflakecomputing.com/")
  .option("sfUser", "***")               #***** value
  .option("sfPassword", "***")      #***** value
  .option("authenticator" , "externalbrowser")
  .option("sfDatabase", "blah")
  .option("sfSchema", "blah")
  .option("sfWarehouse", "blah")
  .load()
)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;With a message in the driver logs:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The browser window for logging in never actually surfaces.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I found a similar comment from 2020 describing the problem: &lt;A href="https://stackoverflow.com/questions/64330577/connection-issue-databricks-snowflake#comment113795080_64334768" alt="https://stackoverflow.com/questions/64330577/connection-issue-databricks-snowflake#comment113795080_64334768" target="_blank"&gt;https://stackoverflow.com/questions/64330577/connection-issue-databricks-snowflake#comment113795080_64334768&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is it still the case that Databricks doesn't support that method of authentication? Are there any alternatives or workarounds besides creating local non-SSO users in Snowflake?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 27 Jan 2023 01:33:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10599#M5753</guid>
      <dc:creator>ajbush</dc:creator>
      <dc:date>2023-01-27T01:33:23Z</dc:date>
    </item>
    <item>
      <title>Re: Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10600#M5754</link>
      <description>&lt;P&gt;&lt;A href="https://community.snowflake.com/s/article/HOW-TO-Setup-SSO-with-Azure-AD-and-the-Snowflake-New-URL-Format-or-Privatelink" alt="https://community.snowflake.com/s/article/HOW-TO-Setup-SSO-with-Azure-AD-and-the-Snowflake-New-URL-Format-or-Privatelink" target="_blank"&gt;The above question refers to SAML authentication for SSO in Snowflake with AAD.&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I can see this isn't going to work without some proxying magic of the localhost session on the driver that serves the authentication redirect.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Since then, I tried to go down the &lt;A href="https://docs.snowflake.com/en/user-guide/oauth-ext-overview.html#external-oauth-overview" alt="https://docs.snowflake.com/en/user-guide/oauth-ext-overview.html#external-oauth-overview" target="_blank"&gt;external OAUTH&lt;/A&gt; route, again using &lt;A href="https://docs.snowflake.com/en/user-guide/oauth-azure.html" alt="https://docs.snowflake.com/en/user-guide/oauth-azure.html" target="_blank"&gt;AAD&lt;/A&gt;. One can configure an Application Registration in Azure AD and create the integration in Snowflake. The issue then becomes how does an end user get an OAUTH2 Access Token within a Databricks notebook session, since they will need to authenticate with the registration from within the running Databricks driver session.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;Aside: Somehow Azure Databricks manages to create a valid OAUTH2 token for ADLS Gen1 and Gen2 with "aud": "&lt;/I&gt;&lt;A href="https://storage.azure.com" alt="https://storage.azure.com" target="_blank"&gt;&lt;I&gt;&lt;/I&gt;&lt;/A&gt;&lt;A href="https://storage.azure.com" target="test_blank"&gt;https://storage.azure.com&lt;/A&gt;&lt;I&gt;" for users when &lt;/I&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/data-governance/credential-passthrough/" alt="https://learn.microsoft.com/en-us/azure/databricks/data-governance/credential-passthrough/" target="_blank"&gt;&lt;I&gt;credential passthrough &lt;/I&gt;&lt;/A&gt;&lt;I&gt;is enabled however I have no idea what the mechanism they use to do this is. I would love to read some technical documentation on how this mechanism works under the hood, how the refresh and access tokens are generated and where they are stored.&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Looking at the &lt;A href="https://learn.microsoft.com/en-us/azure/active-directory/develop/msal-authentication-flows" alt="https://learn.microsoft.com/en-us/azure/active-directory/develop/msal-authentication-flows" target="_blank"&gt;list of authentication flows for OAUTH in msal,&lt;/A&gt; the only one that doesn't require a redirect flow (e.g. Authorization code) or credentials (e.g. Client credentials, ROPC)&lt;A href="https://learn.microsoft.com/en-us/azure/active-directory/develop/scenario-desktop-acquire-token-device-code-flow?tabs=python" alt="https://learn.microsoft.com/en-us/azure/active-directory/develop/scenario-desktop-acquire-token-device-code-flow?tabs=python" target="_blank"&gt; is Device code&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This involves running the following code in Databricks (the actual implementation can be abstracted away in a shared library) and authenticating in a different browser window / tab:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import msal
import logging
import json
import sys
&amp;nbsp;
config = {
    "client_id": "0fedeef6-71c3-42e4-ba4e-d6e2b443bd17",
    "authority": "https://login.microsoftonline.com/9c5da1da-3b7d-4eb6-a0db-b83ada116551",
    "scope": ["api://5b427fec-4148-4dcb-b488-9006ef357fda/session:scope:analyst"]
}
&amp;nbsp;
app = msal.PublicClientApplication(
    config["client_id"], authority=config["authority"],
    )
&amp;nbsp;
result = None
&amp;nbsp;
accounts = app.get_accounts()
if accounts:
    logging.info("Account(s) exists in cache, probably with token too. Let's try.")
    print("Pick the account you want to use to proceed:")
    for a in accounts:
        print(a["username"])
    chosen = accounts[0]
    result = app.acquire_token_silent(config["scope"], account=chosen)
&amp;nbsp;
if not result:
    logging.info("No suitable token exists in cache. Let's get a new one from AAD.")
&amp;nbsp;
    flow = app.initiate_device_flow(scopes=config["scope"])
    if "user_code" not in flow:
        raise ValueError(
            "Fail to create device flow. Err: %s" % json.dumps(flow, indent=4))
&amp;nbsp;
    print(flow["message"])
    sys.stdout.flush()
    
    result = app.acquire_token_by_device_flow(flow)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;which presents the following message in the notebook cell:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/781i748A66D35409FB33/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;A user can then present this OAUTH token in the &lt;A href="https://docs.snowflake.com/en/user-guide/jdbc-parameters.html#label-jdbc-additional-connection-parameters-token" alt="https://docs.snowflake.com/en/user-guide/jdbc-parameters.html#label-jdbc-additional-connection-parameters-token" target="_blank"&gt;JDBC connection details&lt;/A&gt;:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;snowflake_table1 = (spark.read
  .format("snowflake")
  .option("dbtable", "CALL_CENTER")
  .option("sfURL", "xxxxxxxxxx.snowflakecomputing.com/")
  .option("sfUser", "xxxxxxxxxxxxx")
  .option("sftoken", result["access_token"])
  .option("sfRole", "analyst")
  .option("sfAuthenticator" , "oauth",)
  .option("sfDatabase", "SNOWFLAKE_SAMPLE_DATA")
  .option("sfSchema", "TPCDS_SF100TCL")
  .option("sfWarehouse", "COMPUTE_WH")
  .load()
)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;This approach is complex for end users and involves a convoluted authentication flow. Furthermore, if conditional access policies were introduced this method would no longer work:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/798iD502AE0611A4F8C4/image-size/large?v=v2&amp;amp;px=999" role="button" title="image" alt="image" /&gt;&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;B&gt;To conclude: &lt;/B&gt;This is the closest I've come to authenticating to a Snowflake instance that uses AAD for authentication, however it still feels very far from SSO. Is this the only was to get OAUTH tokens to an application in the same tenancy as Databricks for the user logged into Databricks? How does credential passthrough achieve this? Is there anything else I can try?&lt;/P&gt;</description>
      <pubDate>Wed, 01 Feb 2023 00:32:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10600#M5754</guid>
      <dc:creator>ajbush</dc:creator>
      <dc:date>2023-02-01T00:32:44Z</dc:date>
    </item>
    <item>
      <title>Re: Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10601#M5755</link>
      <description>&lt;P&gt;@Alex Bush​&amp;nbsp;did you ever find a better solution to this issue? I have the same problem and want to know the best practice for connecting to Snowflake from a Databricks notebook with SSO.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Jun 2023 22:28:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10601#M5755</guid>
      <dc:creator>yhyhy3</dc:creator>
      <dc:date>2023-06-07T22:28:18Z</dc:date>
    </item>
    <item>
      <title>Re: Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10602#M5756</link>
      <description>&lt;P&gt;We had confirmation from Databricks that the approach detailed in my reply was the only workable approach for user-based SSO. We ended up replicating data across from Snowflake into Databricks in the end. Another approach is to use a service account etc, however this might break some data governance and audit policies if your org. requires such things.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Something else to consider is that if your query can be pushed down into Snowflake then it gets executed on the Snowflake side and the results returned via the connector. As soon as a user executes a query in Databricks that can't be pushed down (e.g., uses a UDF) then the data for every table referenced in that query is pulled through to Databricks and the query is executed on the Databricks side. This will have a very significant impact on query performance, therefore depending on your use-case, data volume, and frequency of Snowflake being queried you might want to consider replication.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.snowflake.com/en/user-guide/spark-connector-use#pushdown" target="test_blank"&gt;https://docs.snowflake.com/en/user-guide/spark-connector-use#pushdown&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2023 01:17:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/10602#M5756</guid>
      <dc:creator>ajbush</dc:creator>
      <dc:date>2023-06-09T01:17:21Z</dc:date>
    </item>
    <item>
      <title>Re: Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/37555#M26386</link>
      <description>&lt;P&gt;For people stumbling on this question, I just want to say the Single Sign On (SSO) method now works with Databricks runtime version 11.2 or newer. This is as interactive input is now possible, thus the external browser option works.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jul 2023 10:56:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/37555#M26386</guid>
      <dc:creator>ludgervisser</dc:creator>
      <dc:date>2023-07-13T10:56:25Z</dc:date>
    </item>
    <item>
      <title>Re: Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/50216#M28740</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/84743"&gt;@ludgervisser&lt;/a&gt;&amp;nbsp;Do you have an example or some documentation you can point to that shows how to configure this?&lt;/P&gt;</description>
      <pubDate>Tue, 31 Oct 2023 19:33:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/50216#M28740</guid>
      <dc:creator>BobGeor_68322</dc:creator>
      <dc:date>2023-10-31T19:33:54Z</dc:date>
    </item>
    <item>
      <title>Re: Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/67163#M33293</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/84743"&gt;@ludgervisser&lt;/a&gt;&amp;nbsp;We are trying to connect to Snowflake via Azure AD user through the externalbrowser method but the browser window doesn't open. Could you please share an example code of how you managed to achieve this, or to some documentation?&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/42251"&gt;@BobGeor_68322&lt;/a&gt;&amp;nbsp;did you find anything, by any chance?&lt;/P&gt;</description>
      <pubDate>Wed, 24 Apr 2024 07:21:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/67163#M33293</guid>
      <dc:creator>aagarwal</dc:creator>
      <dc:date>2024-04-24T07:21:42Z</dc:date>
    </item>
    <item>
      <title>Re: Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/85634#M37242</link>
      <description>&lt;P&gt;I'm also trying this very same thing, connecting Databricks to Snowflake via Azure AD. An example would really be helpful.&lt;/P&gt;</description>
      <pubDate>Wed, 28 Aug 2024 07:09:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/85634#M37242</guid>
      <dc:creator>linnoinen</dc:creator>
      <dc:date>2024-08-28T07:09:52Z</dc:date>
    </item>
    <item>
      <title>Re: Connecting to Snowflake using an SSO user from Azure Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/86854#M37347</link>
      <description>&lt;P&gt;we ended up using device flow oauth because, as noted above, it is not possible to launch a browser on the Databricks cluster from a notebook so you cannot use "externalBrowser" flow. It gives you a url and a code and you open the url in a new tab and then copy and paste the code and it gets a token in the notebook. It is awkward but does work for interactive session.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import msal
import logging
import json
import sys
 
config = {
    "client_id": dbutils.secrets.get("&amp;lt;your secret scope&amp;gt;", "Snowflake-Application-Id"),
    "authority": dbutils.secrets.get("&amp;lt;your ecret scope&amp;gt;", "Snowflake-Login-Authority-Url"),
    "scope": [dbutils.secrets.get("&amp;lt;your secret scope&amp;gt;", "Snowflake-Application-Scope")],
}
 
app = msal.PublicClientApplication(config["client_id"], authority=config["authority"])
 
result = None

if "snowflake_access_token" not in locals():
    flow = app.initiate_device_flow(scopes=config["scope"])

    if "user_code" not in flow:
        raise ValueError("Fail to create device flow. Err: %s" % json.dumps(flow, indent=4))
    print(flow["message"])
    sys.stdout.flush()
    
    result = app.acquire_token_by_device_flow(flow)
    snowflake_access_token = result["access_token"]&lt;/LI-CODE&gt;&lt;LI-CODE lang="python"&gt;# Set Snowflake options below.
sfOptions = {
"sfURL" : "&amp;lt;your account&amp;gt;.snowflakecomputing.com",
"sfUser" : "&amp;lt;your user&amp;gt;",
"sfAuthenticator" : "oauth",
"sfToken" : snowflake_access_token,
"sfDatabase" : "",
"sfSchema" : "",
"sfWarehouse" : "&amp;lt;your warehouse&amp;gt;"
}

# Connect to Snowflake and build data frame.
df = spark.read.format("snowflake") \
.options(**sfOptions) \
.option("query", "select * from &amp;lt;some table&amp;gt;") \
.load()

# Output results of above query.
display(df)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 30 Aug 2024 12:44:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connecting-to-snowflake-using-an-sso-user-from-azure-databricks/m-p/86854#M37347</guid>
      <dc:creator>BobGeor_68322</dc:creator>
      <dc:date>2024-08-30T12:44:16Z</dc:date>
    </item>
  </channel>
</rss>

