Can I use Databricks service principals on Databricks Connect 12.2?

Surajv
New Contributor III

Hi community,

Is it possible to use Databricks service principals for authentication with Databricks Connect 12.2, to connect my notebook or code to Databricks compute, rather than using a personal access token?

I checked the docs and learned that upgrading Databricks Connect to v13+ gives access to service principal authentication, but that also requires setting up Unity Catalog.

As per our current use case, we are restricted to Databricks Connect 12.2 (unless there is no other way out), but that limits the authentication features.

Hence I wanted to ask: is there a way out, or any documentation that shows how to use service principals with Databricks Connect 12.2?

2 REPLIES

Kaniz
Community Manager

Hi @Surajv, let's explore the options for using Databricks service principals with Databricks Connect 12.2.

Databricks Connect lets you connect popular IDEs (such as Visual Studio Code and PyCharm), notebook servers, and other custom applications to Databricks clusters. It enables you to run Spark jobs remotely on a Databricks cluster instead of in your local Spark session.
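
For example, once databricks-connect configure has been run, your local code drives the remote cluster through an ordinary SparkSession. A minimal sketch (assuming a Databricks Connect 12.2 client that is already configured):

    from pyspark.sql import SparkSession

    # With legacy Databricks Connect configured, getOrCreate() returns a
    # session backed by the remote Databricks cluster rather than a local
    # Spark instance.
    spark = SparkSession.builder.getOrCreate()

    # This computation runs on the cluster; only the results come back.
    spark.range(10).selectExpr("id", "id * 2 AS doubled").show()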

Now, regarding your specific question about using service principals:

Databricks Connect and Service Principals:

  • Databricks Connect does not directly support service principals for authentication on Databricks Runtime 12.2 LTS and below.
  • However, if you're using Databricks Runtime 13.0 or above, you can use service principals to authenticate with Databricks Connect.
  • Unfortunately, service principals are not natively supported for your current use case with Databricks Connect 12.2.

Alternative Approach:

  • While Databricks Connect 12.2 doesn't integrate with service principals directly, you can still achieve your goal by using a personal access token (PAT) for authentication.
  • Here's how you can proceed (see the sketch after this list):
    • Create a Databricks service principal within your Databricks account.
    • Add the service principal to your Databricks workspace and grant it the appropriate permissions.
    • Generate a Databricks personal access token for the service principal.
    • Use this token programmatically in your code or notebook to authenticate with Databricks.
  • Although it's not the same as using service principals directly, this approach lets you work with Databricks Connect 12.2 while benefiting from token-based authentication.
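
As a rough sketch of the last two steps, here is one way to wire the token in. This assumes the legacy Databricks Connect environment variables (DATABRICKS_ADDRESS, DATABRICKS_API_TOKEN, DATABRICKS_CLUSTER_ID), which the client reads as an alternative to the values saved by databricks-connect configure; all the values below are placeholders:

    import os
    from pyspark.sql import SparkSession

    # Point legacy Databricks Connect at the workspace and cluster, and
    # supply the PAT generated for the service principal (placeholders).
    os.environ["DATABRICKS_ADDRESS"] = "https://<workspace-url>"
    os.environ["DATABRICKS_API_TOKEN"] = "<service-principal-pat>"
    os.environ["DATABRICKS_CLUSTER_ID"] = "<cluster-id>"

    # The session now authenticates as the service principal via its PAT.
    spark = SparkSession.builder.getOrCreate()
    spark.range(1).show()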

Considerations:

  • Keep in mind that Databricks Connect 12.2 has limitations, and Databricks recommends upgrading to a newer version for better features and security.
  • If you're dealing with SQL queries in Python, consider using the Databricks SQL Connector for Python instead of Databricks Connect, as it's easier to set up and debug (a short sketch follows this list).
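
A minimal sketch with the SQL connector (the databricks-sql-connector package); the hostname, HTTP path, and token are placeholders, and the token can be the PAT generated for the service principal above:

    from databricks import sql

    # Connection details come from the compute resource's "Connection
    # details" tab in the workspace UI.
    with sql.connect(
        server_hostname="<workspace-hostname>",
        http_path="<http-path>",
        access_token="<service-principal-pat>",
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1 AS ok")
            print(cursor.fetchall())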

For detailed steps on creating a Databricks service principal, adding it to your workspace, and generating a token for it, see the Databricks documentation. While Databricks Connect 12.2 may have limitations, it still provides a way to work with Databricks clusters from your preferred IDEs. 🚀

Surajv
New Contributor III

Hi @Kaniz

Thanks for your response. I was able to generate a token for the service principal following this doc, then saved it in the <Databricks Token> variable prompted when running the databricks-connect configure command in the terminal, and was able to run my PySpark code as expected.

But as I read through the docs, the service principal's token expires in 1 hour. I have a few doubts:

  1. Is there a way to increase the expiration time of the token?
  2. Where are this exported variable (i.e. Databricks Token) and the other databricks-connect variables (like Databricks Host, Cluster ID, Org ID) stored in the environment?
    I am trying to build a script that hits the service principal APIs to get a token and then exports the token value for the databricks-connect configure command (doing something similar to what is given in this link, so that I won't have to keep generating a new token every time; rough sketch below).
    I tried finding the variables using <grep -iR "Databricks Token"> and other commands, but didn't get any useful results.
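
For reference, a rough sketch of the script I'm attempting, assuming the workspace's OAuth token endpoint for service principals (/oidc/v1/token with the client-credentials grant) and that legacy Databricks Connect honours the DATABRICKS_API_TOKEN environment variable; the client ID/secret variable names are my own placeholders:

    import os
    import requests

    # Placeholders: workspace URL plus the service principal's OAuth
    # client ID and secret.
    WORKSPACE = "https://<workspace-url>"
    CLIENT_ID = os.environ["SP_CLIENT_ID"]
    CLIENT_SECRET = os.environ["SP_CLIENT_SECRET"]

    # Request a short-lived (~1 hour) access token via client credentials.
    resp = requests.post(
        f"{WORKSPACE}/oidc/v1/token",
        auth=(CLIENT_ID, CLIENT_SECRET),
        data={"grant_type": "client_credentials", "scope": "all-apis"},
    )
    resp.raise_for_status()

    # Export the token where Databricks Connect can pick it up, instead
    # of re-running databricks-connect configure each time.
    os.environ["DATABRICKS_API_TOKEN"] = resp.json()["access_token"]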