Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Can I use Databricks service principals on Databricks Connect 12.2?

Surajv
New Contributor III

Hi community,

Is it possible to use Databricks service principals for authentication with Databricks Connect 12.2 to connect my notebook or code to Databricks compute, rather than using a personal access token?

I checked the docs and learned that upgrading to Databricks Runtime 13+ gives us access to service principal authentication in Databricks Connect, but that also requires setting up Unity Catalog.

As per our current use case, we are restricted to using Databricks Connect 12.2 only (unless there is no other way out), but that limits the authentication features.

Hence I wanted to ask: is there a workaround, or any documentation that shows how to use service principals with Databricks Connect 12.2?

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @Surajv, Let’s explore the options for using Databricks service principals with Databricks Connect 12.2.

Databricks Connect lets you connect popular IDEs (such as Visual Studio Code and PyCharm), notebook servers, and other custom applications to Databricks clusters. It enables you to run Spark jobs remotely on a Databricks cluster instead of in your local Spark session. 

Now, regarding your specific question about using service principals:

Databricks Connect and Service Principals:

  • Databricks Connect does not support service principals for authentication on Databricks Runtime 12.2 LTS and below.
  • On Databricks Runtime 13.0 and above, you can use a service principal to authenticate with Databricks Connect (this also requires Unity Catalog).
  • Unfortunately, this means service principals are not natively supported for your current use case with Databricks Connect 12.2.

Alternative Approach:

  • While Databricks Connect 12.2 doesn’t directly integrate with service principals, you can still achieve your goal by using a personal access token (PAT) for authentication.
  • Here’s how you can proceed:
    • Create a Databricks service principal within your Databricks account.
    • Add the service principal to your Databricks workspace and grant it appropriate permissions.
    • Generate a Databricks personal access token for the service principal.
    • Use this token programmatically in your code or notebook to authenticate with Databricks.
  • Although it’s not the same as using service principals directly, this approach allows you to work with Databricks Connect 12.2 while benefiting from token-based authentication.
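The steps above can be sketched in code. This is a minimal, unofficial sketch: it assumes the PAT for the service principal has already been generated, and that `databricks-connect configure` stores its settings as a small JSON file (the default location is `~/.databricks-connect`; the exact field names below are assumptions for illustration).

```python
import json
from pathlib import Path

def write_databricks_connect_config(path, host, token, cluster_id,
                                    org_id="0", port=15001):
    """Write a databricks-connect style JSON config file.

    The keys mirror the values `databricks-connect configure` prompts for;
    treat the exact field names as assumptions in this sketch.
    """
    config = {
        "host": host,
        "token": token,          # PAT generated for the service principal
        "cluster_id": cluster_id,
        "org_id": org_id,
        "port": port,
    }
    Path(path).write_text(json.dumps(config, indent=2))
    return config

# Demo: write to a local file; the real client reads ~/.databricks-connect.
# All connection values here are placeholders, not a real workspace.
cfg = write_databricks_connect_config(
    "databricks-connect.json",
    host="https://example.cloud.databricks.com",
    token="dapi-REDACTED",
    cluster_id="0123-456789-abcde",
)
```

After writing the config (or running `databricks-connect configure` interactively with the same values), your local PySpark code runs against the remote cluster as usual.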

Considerations:

  • Keep in mind that Databricks Connect 12.2 has limitations, and Databricks recommends upgrading to a newer version for better features and security.
  • If you’re dealing with SQL queries in Python, consider using the Databricks SQL Connector for Python instead of Databricks Connect, as it’s easier to set up and debug.
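For the SQL Connector route mentioned above, a minimal sketch might look like the following. To keep it self-contained, it only assembles the arguments that `databricks.sql.connect()` expects; the hostname, HTTP path, and token are placeholders, and the actual connection (commented out) requires `pip install databricks-sql-connector` and a live workspace.

```python
def sql_connector_params(host: str, http_path: str, token: str) -> dict:
    """Collect the keyword arguments for databricks.sql.connect()."""
    return {
        "server_hostname": host,
        "http_path": http_path,   # HTTP path of a SQL warehouse or cluster
        "access_token": token,    # PAT generated for the service principal
    }

params = sql_connector_params(
    "example.cloud.databricks.com",
    "/sql/1.0/warehouses/abcdef0123456789",   # placeholder warehouse path
    "dapi-REDACTED",
)

# Actual usage (requires the package and a live workspace):
# from databricks import sql
# with sql.connect(**params) as conn:
#     with conn.cursor() as cur:
#         cur.execute("SELECT 1")
#         print(cur.fetchall())
```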

For detailed steps on creating a Databricks service principal, adding it to your workspace, and generating a token for it, see the Databricks documentation. While Databricks Connect 12.2 may have limitations, it still provides a way to work with Databricks clusters from your preferred IDEs. 🚀

Surajv
New Contributor III

Hi @Kaniz_Fatma

Thanks for your response. I was able to generate a token for the service principal following this doc, then entered it at the <Databricks Token> prompt of the databricks-connect configure command in the terminal, and was able to run the PySpark code as expected.

But as I read through the docs, I see that the service principal token expires in 1 hour. I have a few doubts:

  1. Is there a way to increase the expiration time of the token?
  2. Where are this export variable (i.e. Databricks Token) and the other databricks-connect variables (like Databricks Host, Cluster ID, Org ID) stored in the environment?
    I'm trying to build a script that calls the service principal APIs to get a token and then feeds it into databricks-connect configure (similar to what's described in this link), so that I won't have to keep generating a new token manually every time.
    I tried finding the variables using <grep -iR "Databricks Token"> and other commands, but didn't get any useful results.
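A script along the lines described might start like the following sketch. It assumes the workspace supports Databricks OAuth machine-to-machine (client-credentials) tokens at the /oidc/v1/token endpoint; the host, client ID, and secret below are placeholders, and the network call itself is left commented out.

```python
import base64
import json
import urllib.request

def token_request(host: str, client_id: str,
                  client_secret: str) -> urllib.request.Request:
    """Build a client-credentials request for a short-lived OAuth token.

    Assumes the Databricks OAuth M2M endpoint at /oidc/v1/token;
    the host and credentials passed in are placeholders.
    """
    body = b"grant_type=client_credentials&scope=all-apis"
    auth = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        f"https://{host}/oidc/v1/token",
        data=body,
        headers={
            "Authorization": f"Basic {auth}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )

req = token_request("example.cloud.databricks.com", "my-client-id", "my-secret")

# In a real script you would then send the request, read the token, and
# write it into the databricks-connect config (or export it) before it expires:
# with urllib.request.urlopen(req) as resp:
#     token = json.loads(resp.read())["access_token"]
```

Since the token is short-lived by design, re-running a script like this on a schedule (rather than extending the expiry) is the usual pattern.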
