
Library Management via Custom Compute Policies and ADF Job Triggering

SashankKotta
New Contributor III

This guide is for anyone looking to install libraries on a cluster using a Custom Compute Policy and trigger Databricks jobs from an Azure Data Factory (ADF) linked service. While many users rely on init scripts for library installation, Custom Compute Policies are the recommended approach: they provide better control and management over library installations and can be configured to enforce organizational standards.

Follow these steps to achieve this:

  1. Create a Custom Compute Policy:

    • Navigate to Compute > Policies.
    • Create a policy, then open its Libraries tab and add the libraries you need.
    • Libraries can be installed from JAR files, Python wheels, or Maven coordinates.
    • For detailed instructions, see Create and manage compute policies.
    • After creating the custom compute policy, note its policy ID; you will need it when executing the notebook via Azure Data Factory (ADF). A sketch of the equivalent API request follows these steps. [Screenshot: policy details page showing the policy ID]

  2. Trigger a Databricks Notebook from ADF:
    • Use the Custom Compute Policy created in Step 1 when triggering your Databricks notebook from Azure Data Factory (ADF).

    • Through the ADF linked service, the notebook runs on a job or all-purpose cluster with the custom policy attached. Open the Advanced options of the linked service and set the policy ID there, as in the linked-service sketch after these steps. [Screenshot: ADF linked service Advanced options showing the policy ID field]
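
For reference, here is a minimal sketch of the same policy created through the Databricks Cluster Policies API (POST /api/2.0/policies/clusters/create), assuming your workspace supports attaching libraries to policies via the API. The policy name, Maven coordinates, and file paths below are illustrative placeholders, not values from this guide; the API response returns the policy_id you will reference from ADF in Step 2:

    {
      "name": "adf-job-libraries-policy",
      "definition": "{\"spark_version\": {\"type\": \"fixed\", \"value\": \"14.3.x-scala2.12\"}}",
      "libraries": [
        { "maven": { "coordinates": "com.microsoft.azure:spark-mssql-connector_2.12:1.3.0" } },
        { "whl": "/Volumes/main/default/libs/my_package-0.1.0-py3-none-any.whl" },
        { "jar": "/Volumes/main/default/libs/custom-udfs.jar" }
      ]
    }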
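
On the ADF side, setting the policy in the linked service's Advanced options corresponds to a policyId property in the linked service definition. Below is a minimal sketch of an Azure Databricks linked service that runs notebooks on a new job cluster under the policy; the domain, workspace resource ID, node type, and policy ID are placeholder values:

    {
      "name": "AzureDatabricksLinkedService",
      "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
          "domain": "https://adb-1234567890123456.7.azuredatabricks.net",
          "authentication": "MSI",
          "workspaceResourceId": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Databricks/workspaces/<workspace-name>",
          "newClusterVersion": "14.3.x-scala2.12",
          "newClusterNodeType": "Standard_DS3_v2",
          "newClusterNumOfWorker": "2",
          "policyId": "D123456789ABCDEF"
        }
      }
    }

With the policy attached this way, the job cluster that ADF creates for the Notebook activity should inherit the policy's library list, so the libraries are installed automatically without init scripts.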

Summary:

This guide explains how to install libraries on a Databricks cluster using a Custom Compute Policy and trigger Databricks jobs from an Azure Data Factory (ADF) linked service. It details creating a Custom Compute Policy, noting the policy ID, and using this policy ID to execute notebooks via ADF. This method is recommended over using init scripts for library installation.

 

Sashank Kotta
1 REPLY

Sujitha
Community Manager

@SashankKotta Thank you for sharing!
