[Written and tested by James Broadhead, Engineering Manager; Edited and published by Evan Pandya, Product Marketing Manager] 

Intro

Let's say your team has analytics data in a SQL Warehouse and leadership wants more than a dashboard. They want an app where ops reps can look up bookings, flag issues, add notes, and ask questions in plain English. You could wire up Express, a React app, SQL connectors, a Postgres pool, caching, SSE, and auth from scratch. Or you could use AppKit with your favorite coding agent and have it running by lunch.

This post builds a real operations app using `samples.wanderbricks` - a vacation rental marketplace dataset that ships with every Databricks workspace. It has 16 tables covering bookings, users, properties, payments, reviews, destinations, clickstream, and more.

All code snippets live in the git repo. Note: you don't need to clone the repo - we will curl individual files as we go.

What we're building

A single-page app with four capabilities:

  • Revenue by destination — analytics chart and table, queried from a SQL Warehouse
  • Booking manager — ops reps look up a booking and see guest details, powered by a SQL Warehouse query
  • User actions — reps can flag bookings for review and add notes. These actions are persisted in Lakebase.
  • Conversational AI — a Genie chat panel where users ask questions about the data in plain English

This goes beyond what a standard dashboard can do. Dashboards display data. This app lets people act on it — flag a suspicious booking, leave a note for the next shift, ask a follow-up question in natural language.

Architecture

[Architecture diagram: SQL Warehouse handles reads, Lakebase handles writes, Genie provides conversational AI over the same warehouse data]

Since your analytics data already lives in the warehouse, you don't need to copy it anywhere. However, apps need to write too - flags, notes, status changes. That's where Lakebase comes in: not as a mirror of your warehouse data, but as a fully managed, serverless Postgres database that stores your app's state.

  • SQL Warehouse handles all reads - revenue aggregations, booking lookups, everything that queries `samples.wanderbricks`
  • Lakebase handles writes - only the data the app creates: booking flags and notes. The server auto-creates these tables on first startup.
  • Genie provides conversational AI over the same warehouse data

The fast path: hand it to a coding agent

If you use an AI coding assistant (Cursor, Copilot, Claude Code, Genie Code, etc.), you can skip the manual walkthrough entirely:

```bash
# Install the AI skills (this command will drop the `experimental` prefix in a future CLI release)
databricks experimental aitools install

curl -sL https://github.com/databricks/devhub/blob/main/examples/vacation-rentals/blog-post-snippets/setup/ag... -o agent_prompt.md

claude> "Follow the instructions in agent_prompt.md"   # this is the last step! Grab a coffee!
```

The agent will detect your workspace, find or create a SQL Warehouse, set up a Genie space, scaffold the project, wire up all the plugins, deploy to Databricks, and print the URL of the running app. No manual steps required. View the prompt here.

The rest of this post walks through the same app by hand, which will be useful if you want to understand what each layer does or if you want to customize the agent's output.

Before you start

Before running through this walkthrough, confirm your local tools and Databricks workspace are ready. The fastest way:
```bash
curl -sL https://github.com/databricks/devhub/blob/main/examples/vacation-rentals/blog-post-snippets/setup/ve... | bash
```

This script probes everything that's pre-flight-checkable via the CLI and prints a green/red checklist. 

What it expects:

  • Databricks CLI is installed and logged in
  • jq, curl, bash
  • Node.js 20 or newer + npm

Workspace access

  • Permission to read samples.wanderbricks (the dataset must exist and be UC-shared to you)
  • Genie, Databricks Apps, and Lakebase enabled in your workspace
  • Permission to list SQL warehouses
  • If you have multiple workspaces configured with the CLI, the script will print instructions on how to select the right one. 

Note: the CLI can't pre-check every permission; it has no "can I create X?" probe. The walkthrough creates a SQL warehouse (if you have none), a Genie space, and an App deployment, and those will fail at create time if you lack the right entitlements. If that happens, the setup scripts print contextual error messages explaining the most common causes, and the troubleshooting section at the bottom of this post covers the rest.

Step 1: Scaffold the project

```bash
databricks apps init
```

The CLI walks you through naming your app and selecting plugins. Pick Analytics, Genie, and Lakebase when prompted.

```bash
cd wanderbricks-ops
npm install
```

You'll see this structure:

```
wanderbricks-ops/
├── server/
│   └── server.ts
├── client/
│   ├── src/
│   │   ├── main.tsx
│   │   └── App.tsx
│   └── vite.config.ts
├── config/
│   └── queries/
├── .env
└── package.json
```

Step 2: Configure your workspace connection

This script detects your host, finds (or creates) a SQL Warehouse, creates a Genie space, provisions a Lakebase Autoscaling Postgres project, and writes everything to .env:

```bash
curl -sL https://github.com/databricks/devhub/blob/main/examples/vacation-rentals/blog-post-snippets/setup/co... | bash
```

Here's what it does:

  1. Reads DATABRICKS_HOST from your CLI auth config
  2. Picks the first Pro serverless warehouse it can find, preferring one that's already running; if none is eligible, it creates a serverless appkit-dev warehouse
  3. Creates a Genie space called "Wanderbricks" backed by the sample bookings, properties, destinations, and reviews tables
  4. Provisions a Lakebase Autoscaling project (appkit-dev) and discovers its default branch, primary endpoint, and database
  5. Writes everything to .env

After it runs, your .env will look like:

```
# Workspace identity
DATABRICKS_HOST=https://my-workspace.cloud.databricks.com

# Resource IDs the runtime reads
DATABRICKS_WAREHOUSE_ID=abc123def456
DATABRICKS_GENIE_SPACE_ID=01ef234567890abc
LAKEBASE_ENDPOINT=projects/appkit-dev/branches/production/endpoints/primary
PGHOST=ep-xxxx.database.us-west-2.cloud.databricks.com
PGDATABASE=databricks_postgres

# Used by 'databricks apps init --set' in Step 3 (not by the running app)
LAKEBASE_BRANCH=projects/appkit-dev/branches/production
LAKEBASE_DATABASE=projects/appkit-dev/branches/production/databases/db-xxxx
```

Note: if you set DATABRICKS_CONFIG_PROFILE to pick a workspace earlier, the script writes that line to .env as well.

That's the entire environment setup. No tables to copy, no pipelines to create, no connection strings to find.

Step 3: Set up the server

Pull down the server file:

```bash
curl -sL https://github.com/databricks/devhub/blob/main/examples/vacation-rentals/blog-post-snippets/server/s... \
  -o server/server.ts
```

It does three things: 1) registers the AppKit plugins, 2) auto-creates the booking_flags and booking_notes Lakebase tables on first start, and 3) adds five custom routes on top of the plugin-provided ones:

 

| Route | Purpose |
| --- | --- |
| `POST /api/bookings/:id/flag` | Flag a booking for review |
| `DELETE /api/bookings/:id/flag` | Unflag |
| `GET /api/bookings/:id/flag` | Check flag status |
| `POST /api/bookings/:id/notes` | Add a note |
| `GET /api/bookings/:id/notes` | List notes (newest first) |

The Lakebase queries use standard pg.Pool API with $1 placeholders. AppKit handles the Databricks-specific parts: OAuth token rotation, connection pooling, and OpenTelemetry instrumentation.
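
For flavor, here's roughly what one of those routes looks like. This is a minimal sketch, not the file you just downloaded: the Express app setup, the plain `pg.Pool`, and the `booking_flags` schema are stand-ins for AppKit's wiring, which hands you a pool with the Databricks pieces already handled.

```typescript
// Minimal sketch of the flag route. Assumptions: an Express app and a plain
// pg.Pool stand in for the AppKit-provided server and Lakebase pool, and the
// booking_flags schema here is illustrative rather than the repo's exact DDL.
import express from "express";
import { Pool } from "pg";

const app = express();
const pool = new Pool(); // reads PGHOST/PGDATABASE/etc. from the environment

// Auto-create the write-side table on first start (idempotent).
await pool.query(`
  CREATE TABLE IF NOT EXISTS booking_flags (
    booking_id TEXT PRIMARY KEY,
    flagged_at TIMESTAMPTZ NOT NULL DEFAULT now()
  )
`);

// POST /api/bookings/:id/flag: flag a booking. Note the $1 placeholder,
// never string interpolation.
app.post("/api/bookings/:id/flag", async (req, res) => {
  await pool.query(
    `INSERT INTO booking_flags (booking_id) VALUES ($1)
     ON CONFLICT (booking_id) DO NOTHING`,
    [req.params.id]
  );
  res.status(204).end();
});
```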

Step 4: Add the analytics queries

AppKit uses file-based SQL queries. Drop a `.sql` file in `config/queries/` and it becomes an API endpoint automatically.

```bash
curl -sL https://github.com/databricks/devhub/blob/main/examples/vacation-rentals/blog-post-snippets/config/q... \
  -o config/queries/revenue_by_destination.sql

curl -sL https://github.com/databricks/devhub/blob/main/examples/vacation-rentals/blog-post-snippets/config/q... \
  -o config/queries/booking_detail.sql
```

Both query `samples.wanderbricks` directly — no copying or syncing needed. A few things to notice:

  • `:paramName` placeholders are parameterized — no SQL injection risk
  • `-- @Param` annotations declare the parameter types (`DATE`, `NUMERIC`, `STRING`, etc.)
  • The filename is the query key — `booking_detail.sql` maps to `"booking_detail"` in your frontend code

Service principal vs. user context. The filename controls who runs the query:

 

| File | Executes as | Cache scope |
| --- | --- | --- |
| `revenue_by_destination.sql` | Service principal | Shared across all users |
| `revenue_by_destination.obo.sql` | Logged-in user | Per-user cache |

For a shared dashboard, service principal queries (the default) are what you want — one cache entry serves everyone.
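
To make the filename-to-key mapping concrete, here's roughly how the frontend consumes `booking_detail.sql` via the `useAnalyticsQuery` hook you'll meet in Step 5. Treat it as a sketch: the import path, the options shape, and the return shape are assumptions for illustration, not a verbatim AppKit API reference.

```tsx
// Sketch of consuming a file-based query from React. Assumed: the import
// path, the { parameters } option shape, and the { data, loading } return
// shape. The query key "booking_detail" does come from the filename.
import { useAnalyticsQuery } from "@databricks/appkit/react";

function BookingSummary({ bookingId }: { bookingId: string }) {
  const { data, loading } = useAnalyticsQuery("booking_detail", {
    parameters: { booking_id: bookingId }, // name assumed; matches a -- @Param annotation in the .sql file
  });

  if (loading) return <p>Loading…</p>;
  return <pre>{JSON.stringify(data, null, 2)}</pre>;
}
```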

Step 5: Build the React frontend

Fetch all five client components:

```bash
for file in RevenueByDestination.tsx RevenueChart.tsx BookingManager.tsx WanderbricksChat.tsx App.tsx; do
  curl -sL "https://github.com/databricks/devhub/tree/main/examples/vacation-rentals/blog-post-snippets/client/s...${file}" \
    -o "client/src/${file}"
done
```

The five files together make up the UI:

| Component | Purpose | Notable hook |
| --- | --- | --- |
| `RevenueByDestination.tsx` | Table of revenue by destination | `useAnalyticsQuery("revenue_by_destination", …)` |
| `RevenueChart.tsx` | Bar chart over the same query | – |
| `BookingManager.tsx` | Look up a booking, flag it, add notes | `useAnalyticsQuery("booking_detail", …)` + `fetch()` to Lakebase routes |
| `WanderbricksChat.tsx` | Genie chat panel | – |
| `App.tsx` | Two-column grid of the four panels (replaces the scaffolded App.tsx) | – |
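
The write-side actions in BookingManager.tsx are ordinary `fetch()` calls against the Step 3 routes. Here's a simplified sketch of the flag toggle; state handling and error reporting are trimmed relative to the real component:

```tsx
// Simplified flag toggle against the Step 3 routes. The real
// BookingManager.tsx also loads the initial flag state and surfaces errors.
import { useState } from "react";

function FlagButton({ bookingId }: { bookingId: string }) {
  const [flagged, setFlagged] = useState(false);

  async function toggleFlag() {
    // POST flags, DELETE unflags; both routes are keyed by booking id.
    const res = await fetch(`/api/bookings/${bookingId}/flag`, {
      method: flagged ? "DELETE" : "POST",
    });
    if (res.ok) setFlagged(!flagged);
  }

  return (
    <button onClick={toggleFlag}>
      {flagged ? "Unflag" : "Flag for review"}
    </button>
  );
}
```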

Step 6: Run it

```bash
npm run dev
```

This starts the AppKit backend (with hot-reload via `tsx watch`) and the Vite dev server with HMR. Open `http://localhost:8000` and you'll see your app querying live data from your SQL Warehouse, with flags and notes persisted in Lakebase.

Step 7: Deploy to Databricks

From inside wanderbricks-ops/:

```bash
databricks apps deploy
```

This runs the full pipeline: validates the project (build, typecheck, lint), uploads it to your workspace, and starts the app. 

Once the deploy finishes, fetch the running URL:

```bash
databricks apps get wanderbricks-ops --output json | jq -r '.url'
```

In production, AppKit serves the built React app as static files and connects to your workspace using the service principal configured in your app deployment. The platform auto-injects PGHOST, PGUSER, PGDATABASE, PGPORT, and PGSSLMODE from your postgres Apps resource — only LAKEBASE_ENDPOINT carries over from local .env.

Summary

In roughly 200 lines of application code, you built an app that:

  • Queries a SQL Warehouse for analytics with parameterized, cached SQL
  • Looks up individual bookings with guest and property details
  • Lets users flag bookings and add notes, persisted in Lakebase
  • Renders interactive charts from live data
  • Gives users a conversational AI interface to explore the data
  • Streams results over SSE (no polling, no WebSockets to manage)
  • Auto-creates its own database tables on first startup
  • Runs locally with hot reload and deploys to Databricks with one command

AppKit's plugin system handled what you'd otherwise write yourself: the Express route handlers for SSE, a SQL connector, a caching layer, a Postgres connection manager with token rotation, and a streaming implementation.

Where to go next

  • More queries — drop `.sql` files in `config/queries/` and they're immediately available via `useAnalyticsQuery`
  • Per-user queries — rename a file to `.obo.sql` and it executes as the logged-in user instead of the service principal
  • Multiple Genie spaces — add more aliases in the `genie({ spaces: { ... } })` config (sketched just after this list) to let users explore different data domains
  • Synced tables — for sub-10ms lookups on high-traffic data, sync warehouse tables into Lakebase
  • File attachments — add the `files()` plugin to upload and serve documents from Unity Catalog Volumes
  • Type generation — run `npx @databricks/appkit generate-types` to get full TypeScript autocomplete on query parameters and results
  • Custom plugins — when the built-in plugins aren't enough, scaffold your own with `npx @databricks/appkit plugin create`
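
For the multi-space option, the config might look something like this. The `genie({ spaces })` shape comes from the plugin config mentioned above; the import path, alias names, and second space ID are placeholders:

```typescript
// Hypothetical multi-space Genie config. The spaces map follows the
// genie({ spaces: { ... } }) shape referenced above; the import path,
// alias names, and second space ID are placeholders.
import { genie } from "@databricks/appkit";

genie({
  spaces: {
    wanderbricks: process.env.DATABRICKS_GENIE_SPACE_ID!, // the space from Step 2
    revenue: "01ef00000000000000000000000000ab",          // placeholder second space
  },
});
```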

AppKit is an open-source Node.js + React SDK for Databricks Apps. GitHub · Docs

Cleanup

A note on cost

Three of the resources you just created are billed while they exist: the SQL Warehouse, the Lakebase Autoscaling project, and the deployed App. Idle costs are minimal — the warehouse auto-stops after a few minutes of no queries, and Lakebase Autoscaling suspends to zero compute when there's no load. If you're done experimenting, the commands below remove all three.

Run from inside wanderbricks-ops/:

1. Tear down the deployed app via the bundle

```bash
databricks bundle destroy
```

2. Trash the Genie space

```bash
set -a; source .env; set +a
databricks genie trash-space "$DATABRICKS_GENIE_SPACE_ID"
```

3. Delete the Lakebase project (also removes the branch, endpoint, and database)

```bash
databricks postgres delete-project appkit-dev
```

4. Optional: delete the SQL warehouse if configure_env.sh auto-created it (skip this if you reused an existing warehouse — the script only creates a new one when no Pro serverless warehouse exists)

```bash
databricks warehouses delete "$DATABRICKS_WAREHOUSE_ID"
```

Deleting the local wanderbricks-ops/ directory doesn't remove any of these resources; they are tracked separately and stay billed until removed via the commands above.

Troubleshooting

The scripts print contextual error blocks at the points they're most likely to fail. If something else goes wrong, here are the fixes, in roughly the order you'd hit them:

`bash configure_env.sh` fails to create a resource. Read the printed error block — it lists common causes for the specific step that failed (warehouse, Genie space, Lakebase project, or endpoint). Most failures are workspace-level: the feature isn't enabled, or you don't have create permission. Fix the underlying issue (often an admin ask) and re-run.

`databricks apps init` exits with `missing required resource ... use --set ...`. Step 2 didn't populate everything Step 3 expects. Check that `.env` has `DATABRICKS_WAREHOUSE_ID`, `DATABRICKS_GENIE_SPACE_ID`, `LAKEBASE_BRANCH`, and `LAKEBASE_DATABASE`. Re-run configure_env.sh if any are missing.

`npm run dev` errors with `default auth: cannot configure default credentials`. The Databricks SDK can't find a working profile. Run `databricks auth login`, or export `DATABRICKS_CONFIG_PROFILE=<name>` if you have multiple profiles, then restart the dev server.

`npm run dev` errors with missing `PGUSER` or `DATABRICKS_CLIENT_ID`. The Lakebase plugin can't figure out the Postgres user. Set `DATABRICKS_CLIENT_ID` (your service principal client ID) or `PGUSER` (your Postgres role) in `.env`. In a production deploy, the platform auto-injects this — only local dev needs it explicitly.

Browser shows a blank page or `Cannot GET /`. Vite is on a different port, or the backend crashed. Check the terminal — AppKit logs `listening on http://localhost:8000` when ready. If port 8000 is taken, set `DATABRICKS_APP_PORT=8001` in `.env`.

`databricks apps deploy` fails on validation. AppKit projects fail fast on type or build errors before uploading. Run `npm run build` and `npm run typecheck` to see which one breaks.

`databricks apps get wanderbricks-ops` returns no URL. The deploy is still in progress. Wait ~30 seconds and re-run — the URL is populated once the first start completes.