Thanks for the detailed context—here’s how to get Shiny-based apps working with your current setup and data.
1) Accessing data from “Catalog Explorer” in Databricks Apps
A few key points about the Databricks Apps environment and data access:
- Apps support Python and Node.js, not R. So R Shiny won't run in Databricks Apps today.
- System-level installs (apt-get/conda) aren't supported in Apps, but you can manage Python dependencies with a requirements.txt in your app (they're installed during app build/deploy); see the sketch right after this list.
- Apps are designed to integrate with Unity Catalog and Databricks SQL, and the recommended pattern is to query data through a SQL warehouse using the Databricks SQL connector for Python, then render the results in your app UI.
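For reference, a minimal requirements.txt for a Python Shiny app that queries a SQL warehouse might look like the sketch below. The packages are real PyPI names, but pin versions to whatever your environment expects:

shiny
pandas
databricks-sql-connector

These are installed automatically during app build/deploy, so no apt-get/conda access is needed.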
If your organization is using the legacy Hive Metastore (the “Catalog Explorer” UI can browse both legacy HMS and UC objects), here are the practical options to get data into an App:
- Preferred: Bring those tables under Unity Catalog governance (although I understand this might not be an option). Once the tables are governed in UC, your app can query them via Databricks SQL with the permissions model you expect. (A minimal migration sketch follows this list.)
- Near-term workaround: Query through a SQL warehouse from your App using the Databricks SQL connector for Python. This lets you avoid trying to start a Spark session inside the App (not recommended) and returns a Pandas-friendly result set for visualization. (See the connector sketch below.)
- When UC isn't available and you must read files directly: Serverless compute can use instance profiles to access non-UC data in cloud storage. This bypasses catalog metadata but may be acceptable for flat-file reads; it's best used as a temporary bridge until UC is in place. (See the flat-file sketch below.)
2) Running Shiny for Python in a Notebook
Your analysis is spot on: R has special notebook integrations that automatically proxy the app and surface a URL, while the Python version doesn't.
Your code defines the app, but it never starts the web server. You just need to add the command to run it.
First, your observation about the R notebook cell getting "stuck" or "waiting" is normal and expected. A Shiny app is a web server. When you run it, it takes control of the cell and runs continuously to serve the app. To stop it, you must interrupt the cell or stop the cluster. The same will happen with the Python solution.
How to Launch Your Python App: You need to explicitly tell the app to run and, critically, to run on host="0.0.0.0" so it's accessible outside the driver's local machine:
from shiny import ui, render, App
import pandas as pd  # not used in this minimal example; make sure it's installed if your real app needs it

# --- 1. Your App Definition (no changes needed) ---
app_ui = ui.page_fluid(
    ui.input_slider("n", "N", 0, 100, 40),
    ui.output_text_verbatim("txt"),
)

def server(input, output, session):
    @render.text
    def txt():
        return f"n*2 is {input.n() * 2}"

app = App(app_ui, server)

# --- 2. Code to Launch in a Databricks Notebook ---
# Define a port for your app
port = 8080

# Get the workspace hostname, org (workspace) ID, and cluster ID from the notebook context
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
host_name = ctx.tags().apply("browserHostName")
org_id = ctx.tags().apply("orgId")
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")

# Construct the driver-proxy URL (note the org ID between /o/ and the cluster ID)
app_url = f"https://{host_name}/driver-proxy/o/{org_id}/{cluster_id}/{port}/"

# Display the URL as a clickable link in the cell output.
# This must run BEFORE app.run(), because app.run() blocks the cell.
displayHTML(f'Your Shiny app is running! <a href="{app_url}" target="_blank">Click here to open it.</a>')

# --- 3. Run the App ---
# This command will run forever and the cell will show "waiting".
# This is normal! Click the link generated above to see your app.
app.run(host="0.0.0.0", port=port)