Applications often depend on in-house libraries that contain proprietary logic, critical algorithms, or specific data access patterns. Hosting these libraries in a private repository such as JFrog Artifactory keeps them out of public repositories and gives you granular control over who can access them, which helps protect your intellectual property.
This blog post guides you on how to install a private PyPI package from JFrog Artifactory within a Databricks App (DBApps) environment.
For this blog post, we have created a private PyPI package named "custom_calculator" in JFrog Artifactory, and we show how that private package can be installed within DBApps. The package custom_calculator implements some simple arithmetic operations:
"""
Basic arithmetic operations module.
"""
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b


def subtract(a: float, b: float) -> float:
    """Subtract b from a."""
    return a - b


def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b


def divide(a: float, b: float) -> float:
    """Divide a by b."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
We won't focus on how this project is built and packaged. Any standard Python packaging tool should be able to build the project and generate a wheel file containing the custom_calculator package. The wheel is then uploaded to and indexed in a JFrog Artifactory repository. Below is how it appears in the repository:
You can use this article to create a Databricks App. The steps we outline below can be used to make the package custom_calculator accessible from this App’s environment.
To connect your Databricks App to your private repository, you first need the specific PyPI index-URL provided by your JFrog Artifactory instance. This URL acts as the entry point for accessing your private packages. Below is the format of the index-url from JFrog:
https://<username>:<password>@mycompany.jfrog.io/artifactory/api/pypi/pypi-local/simple
Note that the URL includes the credentials needed to authenticate with JFrog Artifactory and access its libraries. To avoid exposing these credentials in application code, store the index-url as a Databricks Secret and reference it from the app's configuration instead.
Using the Databricks CLI's secrets commands, you can create a scope (a container for secrets) and add a secret named, for example, jfrog-pypi-url to it. Supply the full index-url as the secret value.
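Because the credentials sit inside the URL, any special characters in the username or password (such as `@`, `:` or `/`) need to be percent-encoded, or the URL will be parsed incorrectly. The sketch below shows one way to assemble the index-url before storing it as a secret, plus a helper that masks the credentials for logging. The function names `build_index_url` and `redact` and the sample credentials are hypothetical and only for illustration; the host and repository follow the example URL above.

```python
from urllib.parse import quote, urlsplit


def build_index_url(username: str, password: str, host: str, repo: str) -> str:
    """Assemble an Artifactory PyPI index URL with percent-encoded credentials."""
    # safe="" forces characters such as '@', ':' and '/' to be encoded as well.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"https://{user}:{secret}@{host}/artifactory/api/pypi/{repo}/simple"


def redact(url: str) -> str:
    """Return the URL with its credentials masked, so it is safe to log."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    return f"{parts.scheme}://***:***@{host}{parts.path}"


if __name__ == "__main__":
    # Sample values only; replace with your own Artifactory user and token.
    url = build_index_url("user@x.com", "p@ss:w/rd", "mycompany.jfrog.io", "pypi-local")
    print(redact(url))
```

Only the redacted form should ever be printed or logged; the full URL belongs in the secret.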
databricks secrets create-scope my-secret-scope
databricks secrets put-secret my-secret-scope jfrog-pypi-url
Once the secret is created, you need to make it accessible to your application. This is done by adding the secret as a resource in the Resources section of your Databricks App. The resource key you assign there (external-pypi-url in this example) is the name that your app's deployment configuration uses to refer to the secret.
The app.yaml file in a Databricks App defines how your app is executed. In your app.yaml file, add a PIP_EXTRA_INDEX_URL environment variable. This variable tells pip to search your private JFrog Artifactory index in addition to the public PyPI. Note that pip does not give either index priority: it gathers candidates from both and installs the best matching version, so keep your private package name distinct from any package on the public PyPI. The value of this variable should be sourced from the resource key you defined in the previous step, so that the Artifactory URL is retrieved from the secret rather than hard-coded.
app.yaml:
command: [
  "flask",
  "--app",
  "app.py",
  "run"
]
env:
  - name: "PIP_EXTRA_INDEX_URL"
    valueFrom: "external-pypi-url"
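One point worth knowing about PIP_EXTRA_INDEX_URL is that pip does not rank indexes: candidates from every configured index are pooled, and the best matching version wins regardless of where it came from. The sketch below is a deliberately simplified model of that behaviour, not pip's actual resolver (which also considers wheel compatibility, pre-releases, and version specifiers). The names `parse_version` and `pick_candidate` and the index contents are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a simple dotted release such as '1.2.0' (no pre/post-release tags)."""
    return tuple(int(part) for part in version.split("."))


def pick_candidate(indexes: Dict[str, List[str]]) -> Tuple[str, str]:
    """Return (index_name, version) for the highest version across ALL indexes."""
    candidates = [
        (parse_version(version), name, version)
        for name, versions in indexes.items()
        for version in versions
    ]
    if not candidates:
        raise LookupError("no candidate found in any index")
    # The index a candidate came from plays no role in the comparison.
    _, name, version = max(candidates, key=lambda candidate: candidate[0])
    return name, version


if __name__ == "__main__":
    # Hypothetical situation: a same-named package exists on the public index.
    available = {
        "public-pypi": ["9.9.9"],
        "artifactory": ["1.0.0", "1.1.0"],
    }
    print(pick_candidate(available))  # ('public-pypi', '9.9.9')
```

In this model, a same-named public package with a higher version would be chosen over the private one, which is why a unique package name matters when combining a private index with the public PyPI.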
To ensure that your private library is installed during the app's deployment, add its name (e.g., custom_calculator) to your requirements.txt file. This file lists all the Python dependencies required by your application.
requirements.txt:
flask
custom_calculator
To demonstrate that the package imports successfully, we run these arithmetic operations inside a Flask application created as a Databricks App. The app.py file for this application looks like this:
from flask import Flask, render_template_string, request
from custom_calculator import operations
import logging

# Only show errors from Flask's built-in server logger.
log = logging.getLogger('werkzeug')
log.setLevel(logging.ERROR)

app = Flask(__name__)

HTML_TEMPLATE = """
<!DOCTYPE html>
<html>
<head>
    <title>Custom Calculator Sample Execution</title>
</head>
<body>
    <h1>Custom Calculator Sample Execution</h1>
    <form method="post">
        <label for="a">a:</label>
        <input type="number" step="any" name="a" id="a" value="{{ a }}">
        <label for="b">b:</label>
        <input type="number" step="any" name="b" id="b" value="{{ b }}">
        <button type="submit">Calculate</button>
    </form>
    {% if results %}
    <ul>
        <li>Add: {{ a }} + {{ b }} = {{ results['add'] }}</li>
        <li>Subtract: {{ a }} - {{ b }} = {{ results['subtract'] }}</li>
        <li>Multiply: {{ a }} * {{ b }} = {{ results['multiply'] }}</li>
        <li>Divide: {{ a }} / {{ b }} = {{ results['divide'] }}</li>
    </ul>
    {% endif %}
</body>
</html>
"""


@app.route("/", methods=["GET", "POST"])
def index():
    # Fall back to sample values when the form has not been submitted yet.
    a = request.form.get("a", 10)
    b = request.form.get("b", 5)
    results = None
    try:
        a_val = float(a)
        b_val = float(b)
        results = {
            "add": operations.add(a_val, b_val),
            "subtract": operations.subtract(a_val, b_val),
            "multiply": operations.multiply(a_val, b_val),
        }
        try:
            results["divide"] = operations.divide(a_val, b_val)
        except Exception as e:
            # Show the division error (e.g. divide by zero) in place of a result.
            results["divide"] = f"Error: {e}"
    except Exception:
        # Invalid input: render the form without results.
        results = None
    return render_template_string(HTML_TEMPLATE, a=a, b=b, results=results)


if __name__ == "__main__":
    app.run(debug=True)
After configuring your `app.yaml` and `requirements.txt`, deploy and start your Databricks App. You can then verify the successful installation of your private library by checking the Environment tab in your Databricks App's interface. This provides confirmation that your dependencies were correctly resolved and installed.
Finally, run the App to verify that the logic defined in the imported package is working correctly.
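Besides checking the Environment tab, the app itself can confirm at runtime that the private package was installed. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and it assumes the distribution is published under the name custom_calculator.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("custom_calculator")
    if found is None:
        print("custom_calculator is not installed in this environment")
    else:
        print(f"custom_calculator {found} is installed")
```

Logging this once at startup gives a quick signal in the app logs if the private index could not be reached during deployment.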
In a nutshell, integrating your private PyPI libraries from JFrog Artifactory into your Databricks Apps environment involves securely storing your Artifactory URL as a secret, referencing that secret in your app's configuration, and specifying your private library in your application's requirements. This process ensures that your custom code is deployed and available within your Databricks application securely and efficiently.
Build Apps on Databricks today!