Every enterprise has geospatial data. Telecom companies track tower locations. Retailers map store footprints. Logistics networks route around real-world geography. But when a business user asks a simple location-based question — which sites in this region generate the highest revenue? — the answer almost always requires an engineer, a handful of SQL queries, and a turnaround measured in hours.
The gap is not in storage or compute. Databricks handles both. The gap is in the last mile: turning a spatial question into a spatial query.
Consider a retail planner who wants to know which stores within a target region are underperforming. Today, that question requires:
For engineers, this is tedious. For business users, it is a wall. The result is that most geospatial data sits underused — not because the platform cannot handle it, but because the interface demands too much expertise.
GeoGenie is an open-source reference app that shows how Databricks Genie Space and Databricks Apps can be combined to reduce this friction. The workflow has three steps:
Behind the scenes, when a user draws a shape, GeoGenie converts the polygon coordinates into WKT (Well-Known Text) and appends that geometry to the prompt sent to Databricks Genie. Genie generates the SQL with the spatial filter already built in:
    WITH ranked_sites AS (
      SELECT *,
             RANK() OVER (ORDER BY total_monthly_revenue DESC) AS revenue_rank
      FROM catalog.geogenie.site_locations
      WHERE ST_Intersects(
        ST_GeomFromWKT('POLYGON((-102.3 26.8, -95.2 26.8, -95.2 31.4, -102.3 31.4, -102.3 26.8))'),
        ST_Point(longitude, latitude)
      )
    )
    SELECT site_name, city, state, tenant_name, total_monthly_revenue
    FROM ranked_sites
    WHERE revenue_rank <= 5
    ORDER BY total_monthly_revenue DESC;
The Genie Space is pre-configured with spatial-query instructions, so the generated SQL consistently uses ST_Intersects, constructs points with longitude first in ST_Point, and applies the drawn region as a spatial predicate automatically. The user focuses on the question. GeoGenie handles the spatial context and query generation.
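The polygon-to-WKT step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GeoGenie's actual code: the function names `polygon_to_wkt` and `build_prompt`, and the wording of the appended prompt text, are hypothetical. The sketch assumes the drawn shape arrives as a list of (longitude, latitude) pairs.

```python
from typing import List, Sequence, Tuple

LonLat = Tuple[float, float]  # (longitude, latitude), longitude first


def polygon_to_wkt(vertices: Sequence[LonLat]) -> str:
    """Convert (longitude, latitude) vertices into a WKT POLYGON string.

    Hypothetical helper: closes the ring if the caller left it open,
    because a WKT polygon ring must start and end on the same point.
    """
    ring: List[LonLat] = list(vertices)
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    if len(ring) < 4:
        raise ValueError("a polygon needs at least three distinct vertices")
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON(({coords}))"


def build_prompt(question: str, wkt: str) -> str:
    """Append the drawn region to the user's question (assumed wording)."""
    return (
        f"{question}\n\n"
        f"Restrict results to this region (WKT, longitude first): {wkt}"
    )


if __name__ == "__main__":
    drawn = [(-102.3, 26.8), (-95.2, 26.8), (-95.2, 31.4), (-102.3, 31.4)]
    print(build_prompt("Which 5 sites generate the highest revenue?",
                       polygon_to_wkt(drawn)))
```

Keeping the coordinate order explicit (longitude, then latitude) in one helper mirrors the convention the Genie Space instructions enforce for `ST_Point`, so the region and the points agree on axis order.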
GeoGenie combines four capabilities in the Databricks platform into a single experience:
| Layer | Technology |
|---|---|
| Frontend | CesiumJS 3D globe + Streamlit |
| AI / NL-to-SQL | Databricks Genie Space |
| Data governance | Unity Catalog + SQL Warehouse |
| App platform | Databricks Apps |
The Cesium globe runs inside an iframe rendered through Streamlit's custom component API and communicates with the Python backend via window.postMessage. This lets GeoGenie pair a rich JavaScript geospatial UI with simple Streamlit application logic — all deployed and authenticated through Databricks Apps with a service principal.
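On the Python side, the backend has to turn whatever the iframe posts into validated coordinates before building a query. The sketch below shows one way that could look. The message schema (a JSON object with a `type` of `"polygon_drawn"` and a `coordinates` list of `[longitude, latitude]` pairs) and the function name `parse_polygon_message` are assumptions made for illustration; the source does not describe GeoGenie's actual payload format.

```python
import json
from typing import List, Tuple


def parse_polygon_message(raw: str) -> List[Tuple[float, float]]:
    """Parse a JSON message from the map iframe into (lon, lat) vertices.

    Assumed (hypothetical) schema:
        {"type": "polygon_drawn", "coordinates": [[lon, lat], ...]}
    """
    message = json.loads(raw)
    if not isinstance(message, dict) or message.get("type") != "polygon_drawn":
        raise ValueError("unexpected message type")

    vertices: List[Tuple[float, float]] = []
    for pair in message.get("coordinates", []):
        lon, lat = float(pair[0]), float(pair[1])
        # Reject values outside the valid geographic range; this also
        # catches many cases of accidentally swapped latitude/longitude.
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"coordinate out of range: {lon}, {lat}")
        vertices.append((lon, lat))

    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    return vertices


if __name__ == "__main__":
    raw = json.dumps({
        "type": "polygon_drawn",
        "coordinates": [[-102.3, 26.8], [-95.2, 26.8], [-95.2, 31.4]],
    })
    print(parse_polygon_message(raw))
```

Validating at this boundary keeps malformed or out-of-range shapes from ever reaching the prompt that is sent to Genie.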
Setup is a single notebook. Clone the repo, run setup_and_deploy, and the notebook provisions everything: the Unity Catalog objects, synthetic sample data, a configured Genie Space with spatial instructions, and the deployed app with the right permissions. After setup, the app is ready to share with your team.
Tying map interaction directly to Genie query generation matters for three reasons:
New personas get access to spatial analytics. A business analyst who cannot write ST_Intersects can now draw a region and ask a question. The barrier drops from "knows spatial SQL" to "can point at a map."
Governed by default. Every query runs through Unity Catalog. A business user gets spatial answers without being granted direct table access. The Genie Space controls which tables and columns are exposed. This is not a shortcut around governance — it is governance made usable.
A pattern, not just a demo. The architecture — visual interaction layer, natural language interface, governed data — is reusable. Telecom network planning, retail site intelligence, logistics coverage analysis, field asset management: any domain where location is a first-class dimension of the data can follow this pattern.
The repository is open source: https://github.com/databricks-solutions/genie-geo-chat
If you work with geospatial data on Databricks, use it as a starting point and adapt it to your own use case. Open an issue if you run into problems, and share what this inspires you to build. The combination of spatial SQL, natural language, and governed data access is still early — and there is a lot of room to push it further.