
The Databricks Python SDK

RiyazAliM
Honored Contributor

The Databricks SDK is a library (for Python, in our case) that lets you control and automate actions on Databricks using the methods available on the `WorkspaceClient` (more about this below).

Why do we need the Databricks SDK?

- Automation: You can script the actions you typically perform on Unity Catalog (Catalogs/Schemas/Tables/Views/..), Declarative Pipelines (formerly DLT), Databricks Jobs (formerly Workflows), Clusters, or any other component of the Databricks ecosystem.

- Infrastructure as Code (IaC): Manage Databricks resources from Python code (alongside tools such as Terraform), with version control and CI/CD.

- Advanced Controls: Do the things the UI cannot do (business ops, complex dependency logic across pipelines or workflows).

Let's see how we can interact with the Databricks ecosystem using the Python SDK.

First things first, let's install and set up the Databricks SDK. (If you're using the SDK inside a Databricks Workspace, you should be good to go. If you're developing locally, the installation steps below are for you.)

Install using pip:

pip3 install databricks-sdk

If you want to install a specific version, pin it explicitly as below:

pip3 install databricks-sdk==0.1.6

To authenticate and validate your Databricks credentials, use any one of the following methods mentioned over here.

You can also refer to this document for more details on authentication

As I'm working in a Databricks Workspace, I'll show you the token-based method of authentication and how to instantiate the `WorkspaceClient`.

from databricks.sdk import WorkspaceClient

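# Grab the API token of the current notebook session from the notebook context
bearer_token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().getOrElse(None)
# This Spark conf holds the workspace hostname, usually without the https:// scheme
workspace_url = spark.conf.get("spark.databricks.workspaceUrl")

w = WorkspaceClient(token=bearer_token, host=f"https://{workspace_url}")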

In the above code snippet, I've used the dbutils notebook entry point to grab the token and the Spark configuration to extract the workspace URL.
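The value stored in `spark.databricks.workspaceUrl` is usually a bare hostname, while `host` is conventionally given as a full URL. A minimal sketch of a normalizing helper follows; `normalize_host` is a hypothetical name (not part of the SDK) and the hostname in the example is made up.

```python
def normalize_host(workspace_url: str) -> str:
    """Return the workspace URL with an https:// scheme and no trailing slash.

    Hypothetical helper for illustration; not part of the Databricks SDK.
    """
    url = workspace_url.strip().rstrip("/")
    if not url.startswith(("https://", "http://")):
        url = "https://" + url
    return url


# Made-up hostname, for illustration only
print(normalize_host("example.cloud.databricks.com"))
```

In the notebook, the result could then be passed as `host=normalize_host(workspace_url)`.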

I'm going to walk through a small example: preserving the grants on a view when it is replaced by a `CREATE OR REPLACE` statement.

How views behave in Databricks: let's say you have a set of grants defined on a view. When you run CREATE OR REPLACE on it, those grants are reset. More about the behaviour over here.

But you want the grants to remain on the view even after running the CREATE OR REPLACE statement. Let's see how we can do this with the help of the Python SDK for Databricks.

To replicate, follow along:

Let's first create a view (I'm going to use the catalog dev and the schema default to store the view; you can use any table available in UC to create it):

CREATE OR REPLACE VIEW dev.default.test_view as (SELECT * FROM dev.bronze.customer_raw)

Then query the grants available on the view:

SHOW GRANTS on dev.default.test_view

(note that a few grants assigned on the catalog are inherited by the view):

[Screenshot: SHOW GRANTS output for dev.default.test_view]

Now, let's grant SELECT and ALL PRIVILEGES on our view:

GRANT SELECT, ALL PRIVILEGES on dev.default.test_view to `harshavardhan.******@****.com`;

Now, if we run SHOW GRANTS on the view, you'd see Harshavardhan's email with SELECT and ALL PRIVILEGES in the query result (along with the privileges inherited from the catalog):

[Screenshot: SHOW GRANTS output for dev.default.test_view after the GRANT]

Now if you run CREATE OR REPLACE on the view, Harshavardhan no longer has the privileges. We want to store the privileges and apply them back when we re-create our view. Execute the CREATE OR REPLACE and check the grants to see Harsha's email ID disappear. If you do, re-run the GRANT statement above afterwards so the grants are in place for the next step.

Let's get started with the Databricks SDK for Python and handle the use case:

Recall that we've authenticated and created `w`, a `WorkspaceClient` object. We'll use the `get` method of its `grants` API to read the permissions on the table/view.

from databricks.sdk.service import catalog
from databricks.sdk.service.catalog import PermissionsChange, Privilege

# Capture the grants assigned directly on the view before it is replaced
direct_grants = w.grants.get(
    securable_type=catalog.SecurableType.TABLE,
    full_name="dev.default.test_view"
)

When you print the `direct_grants` object, you'd see a `PermissionsList` whose `privilege_assignments` hold the `principal` and `privileges` defined on the view. Note that the `grants.get()` method we used returns only the directly assigned grants, not the inherited ones.

[Screenshot: printed `direct_grants` object]
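Since the returned object only lives in memory, it can help to flatten it into plain data that is easy to log or store before the view is replaced. The sketch below assumes each assignment exposes `principal` and `privileges` as described above, and that each privilege is an enum with a string `value`. `grants_to_dict`, `FakePrivilege`, and the email address are hypothetical stand-ins so the example runs without a workspace.

```python
import json
from enum import Enum
from types import SimpleNamespace


def grants_to_dict(permissions) -> dict:
    """Map each principal to a sorted list of privilege names."""
    snapshot = {}
    for assignment in permissions.privilege_assignments or []:
        # Enum members expose .value; fall back to str() for plain strings
        names = [getattr(p, "value", str(p)) for p in (assignment.privileges or [])]
        snapshot[assignment.principal] = sorted(names)
    return snapshot


class FakePrivilege(Enum):
    """Stand-in for the SDK's privilege enum (assumption for this demo)."""
    SELECT = "SELECT"
    ALL_PRIVILEGES = "ALL_PRIVILEGES"


# Stand-in for the object returned by w.grants.get(...)
fake_grants = SimpleNamespace(
    privilege_assignments=[
        SimpleNamespace(
            principal="user@example.com",
            privileges=[FakePrivilege.SELECT, FakePrivilege.ALL_PRIVILEGES],
        )
    ]
)

print(json.dumps(grants_to_dict(fake_grants), indent=2))
```

In the notebook, `grants_to_dict(direct_grants)` would be the equivalent call on the real object.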

Now, let's run a CREATE OR REPLACE on the view again to see Harsha's grants disappear into thin air.

Let's use the Databricks SDK to re-apply the stored grants.

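# Re-apply every directly assigned privilege captured before the view was replaced
w.grants.update(
    securable_type=catalog.SecurableType.TABLE,
    full_name="dev.default.test_view",
    changes=[
        PermissionsChange(
            principal=assignment.principal,
            add=list(assignment.privileges or [])
        )
        # privilege_assignments can be None when there are no direct grants
        for assignment in direct_grants.privilege_assignments or []
    ]
)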

`grants.update()` returns a `PermissionsList`, similar to `grants.get()`. Now query the grants on the view using SHOW GRANTS ON <VIEW_NAMESPACE> to see Harsha's grants assigned back.
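The three steps (read the direct grants, replace the view, add the grants back) can be wrapped into one helper. This is a sketch under assumptions, not an official pattern: `replace_view_preserving_grants` and its parameter names are hypothetical, and the SDK pieces are passed in as arguments so the function can be exercised without a workspace. It is also not atomic, so a failure between the replace and the update would leave the view without its direct grants.

```python
def replace_view_preserving_grants(
    grants_api, run_sql, full_name, select_sql, securable_type, make_change
):
    """Replace a view and re-apply the grants that were directly assigned to it.

    grants_api     -- object with get()/update(), e.g. w.grants
    run_sql        -- callable that executes a SQL string, e.g. spark.sql
    securable_type -- e.g. catalog.SecurableType.TABLE
    make_change    -- factory taking principal= and add=, e.g. PermissionsChange
    """
    # 1. Snapshot the direct (non-inherited) grants before touching the view
    before = grants_api.get(securable_type=securable_type, full_name=full_name)
    assignments = before.privilege_assignments or []

    # 2. Replace the view; this is the step that resets its grants
    run_sql(f"CREATE OR REPLACE VIEW {full_name} AS {select_sql}")

    # 3. Re-apply the snapshot, skipping principals with nothing to add
    changes = [
        make_change(principal=a.principal, add=list(a.privileges))
        for a in assignments
        if a.privileges
    ]
    if not changes:
        return None
    return grants_api.update(
        securable_type=securable_type, full_name=full_name, changes=changes
    )
```

With the objects from this article, the call might look like `replace_view_preserving_grants(w.grants, spark.sql, "dev.default.test_view", "SELECT * FROM dev.bronze.customer_raw", catalog.SecurableType.TABLE, PermissionsChange)`.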

You can also interact with DLT (Declarative Pipelines now) and Workflows. Check out the documentation for more details.
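As a small illustration of reaching beyond Unity Catalog, the sketch below collects job names from a client. It assumes that `jobs.list()` on the client yields objects carrying `job_id` and `settings.name`. `first_job_names` and the fake client are hypothetical, included so the example runs without a workspace.

```python
from itertools import islice
from types import SimpleNamespace


def first_job_names(client, limit=5):
    """Return (job_id, name) pairs for the first `limit` jobs the client lists."""
    pairs = []
    for job in islice(client.jobs.list(), limit):
        # settings may be missing on a listed job, so guard before reading the name
        name = job.settings.name if job.settings else None
        pairs.append((job.job_id, name))
    return pairs


# Stand-in client with two made-up jobs; in a notebook, pass the real `w` instead
fake_client = SimpleNamespace(
    jobs=SimpleNamespace(
        list=lambda: iter(
            [
                SimpleNamespace(job_id=1, settings=SimpleNamespace(name="nightly_load")),
                SimpleNamespace(job_id=2, settings=None),
            ]
        )
    )
)

print(first_job_names(fake_client))
```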

Thank you for staying tuned.

 

Riz

sridharplv
Valued Contributor II

Good Article @RiyazAliM.
