
Run Delta Live Tables as service principal

knutasm
New Contributor III

How do you run a Delta Live Tables pipeline in production? It uses the owner's (creator's) permissions for writing to tables, and I can't change the owner of a UC-enabled pipeline after creation. I don't want regular users to have write access to prod tables; I want the pipeline to run as a service principal instead.

I can find no example in the documentation, but it seems like such a glaring shortcoming that I am sure I must have missed something.

7 REPLIES

knutasm
New Contributor III

Thank you.

However, like you say, the owner of the pipeline cannot be changed once set for a UC pipeline. Yet I am unable to choose the pipeline owner when creating the pipeline in the first place, which is where I would have to set the service principal. I am also unable to create a pipeline as a service principal.


@knutasm wrote:

However, like you say, the owner of the pipeline cannot be changed once set for a UC pipeline. [...] I am also unable to create a pipeline as a service principal.

Hmm. I wonder if creating the pipeline could be done with the API? I've not seen anything about that in my browsing around. Even if that's an option, I wouldn't say it's ideal.

I'm going to assume that DLT with UC simply isn't finished baking yet and that this will change sometime... In the meantime, I might not jump into DLT with both feet just yet.
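For reference, the relevant read-only call does exist. Here's a minimal sketch, assuming placeholder values for the workspace URL, token, and pipeline ID, that fetches a pipeline via the public GET /api/2.0/pipelines/{pipeline_id} endpoint and prints who created it and who it runs as (the creator_user_name and run_as_user_name fields; verify against your workspace):

import requests

# Placeholders: fill in your workspace URL, token, and pipeline ID.
databricks_instance = "https://<workspace-url>"
token = "<personal-access-token>"
pipeline_id = "<pipeline-id>"

# GET /api/2.0/pipelines/{pipeline_id} returns pipeline metadata,
# including creator_user_name and run_as_user_name.
resp = requests.get(
    f"{databricks_instance}/api/2.0/pipelines/{pipeline_id}",
    headers={"Authorization": f"Bearer {token}"}
)
resp.raise_for_status()
info = resp.json()
print("creator:", info.get("creator_user_name"))
print("runs as:", info.get("run_as_user_name"))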

Oliver_Angelil
Valued Contributor II

Have the same issue.

serelk
New Contributor III

Same here 

js54123875
New Contributor III

same!

AmanSehgal
Honored Contributor III

Configure pipeline permissions

You must have the CAN MANAGE or IS OWNER permission on the pipeline to manage permissions. Pipelines use access control lists (ACLs) to control permissions; see the Databricks documentation for a complete list of permission levels and what they allow. To grant access in the UI (a scripted equivalent is sketched after these steps):

  1. In the sidebar, click Delta Live Tables.

  2. Select the name of a pipeline.

  3. Click Share. The Permissions Settings dialog appears.

  4. Click Select User, Group or Service Principal… and select a user, group, or service principal.

  5. Select a permission from the permission drop-down menu.

  6. Click Add.

  7. Click Save.
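The same grant can be scripted. Here's a minimal sketch using the generic Databricks Permissions API (endpoint and field names come from that API; the workspace URL, token, pipeline ID, and service-principal application ID are placeholders). Note that, per the discussion above, assigning IS_OWNER this way may be rejected for UC pipelines, so this grants CAN_MANAGE:

import requests

# Placeholders: fill in your workspace URL, token, pipeline ID,
# and the service principal's application ID.
databricks_instance = "https://<workspace-url>"
token = "<personal-access-token>"
pipeline_id = "<pipeline-id>"

# PATCH adds or updates ACL entries without replacing the whole list.
resp = requests.patch(
    f"{databricks_instance}/api/2.0/permissions/pipelines/{pipeline_id}",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "access_control_list": [
            {
                "service_principal_name": "<sp-application-id>",
                "permission_level": "CAN_MANAGE"
            }
        ]
    }
)
resp.raise_for_status()
print(resp.json())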

ashwini0723
New Contributor II

@knutasm I have built a solution for this. The way to create a DLT pipeline using an SPN is to write code that creates a new DLT pipeline via the Databricks API, specifying the owner as a service principal in the API payload, as shown below. This method worked for me 🙂

import requests

databricks_instance = ""  # e.g. https://adb-<workspace-id>.<region>.azuredatabricks.net
client_id = dbutils.secrets.get(scope="", key="")
client_secret = dbutils.secrets.get(scope="", key="")
tenant_id = ""

# Obtain an Entra ID (Azure AD) token for the service principal.
# The scope below is the fixed application ID of the Azure Databricks resource.
token_response = requests.post(
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
    data={
        'grant_type': 'client_credentials',
        'client_id': client_id,
        'client_secret': client_secret,
        'scope': '2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default'
    }
)

access_token = token_response.json().get("access_token")

if not access_token:
    print(f"Error obtaining token: {token_response.text}")
else:
    headers = {
        'Authorization': f'Bearer {access_token}',
        'Content-Type': 'application/json'
    }

    # Define the pipeline creation payload using Unity Catalog.
    # Because this request is authenticated as the service principal,
    # the service principal becomes the pipeline's creator.
    create_pipeline_payload = {
        "name": "",
        "catalog": "",   # the Unity Catalog catalog name
        "target": "",    # the target schema within that catalog
        "clusters": [
            {
                "label": "default",
                "node_type_id": "Standard_DS3_v2",
                "autoscale": {
                    "min_workers": 1,
                    "max_workers": 2,
                    "mode": "ENHANCED"
                }
            }
        ],
        "libraries": [
            {
                "notebook": {
                    "path": ""   # workspace path of the DLT notebook
                }
            }
        ],
        "permissions": [
            {
                "service_principal_name": '',
                "permission_level": "IS_OWNER"  # set the desired permission level
            }
        ],
        "development": True  # set to False for production mode
    }

    # Make the API request to create the pipeline
    response = requests.post(
        f'{databricks_instance}/api/2.0/pipelines',
        headers=headers,
        json=create_pipeline_payload
    )

    # Check the response
    if response.status_code == 200:
        pipeline_id = response.json().get("pipeline_id")
        print(f"Pipeline created successfully with ID: {pipeline_id}")
    else:
        print(f"Failed to create pipeline: {response.status_code}, {response.text}")