Technical Blog
Explore in-depth articles, tutorials, and insights on data analytics and machine learning in the Databricks Technical Blog. Stay updated on industry trends, best practices, and advanced techniques.
MohanaBasak
Databricks Employee

Your Own GeoGenie: Natural Language Geospatial Analytics on Databricks

Every enterprise has geospatial data. Telecom companies track tower locations. Retailers map store footprints. Logistics networks route around real-world geography. But when a business user asks a simple location-based question — which sites in this region generate the highest revenue? — the answer almost always requires an engineer, a handful of SQL queries, and a turnaround measured in hours.

The gap is not in storage or compute. Databricks handles both. The gap is in the last mile: turning a spatial question into a spatial query.

The problem with spatial SQL today

Consider a retail planner who wants to know which stores within a target region are underperforming. Today, that question requires:

  • Knowing the table schema, column names, and which columns hold coordinates
  • Writing SQL with spatial functions like ST_Intersects and ST_GeomFromWKT
  • Manually constructing WKT geometry strings to define the area of interest
  • Iterating through multiple queries to narrow down the right region

For engineers, this is tedious. For business users, it is a wall. The result is that most geospatial data sits underused — not because the platform cannot handle it, but because the interface demands too much expertise.

Summon GeoGenie: draw a region, ask a question, get an answer

GeoGenie is an open-source reference app that shows how Databricks Genie Space and Databricks Apps can be combined to reduce this friction. The workflow has three steps:

  1. Explore a 3D interactive globe with site locations plotted as beacons. Click any site to inspect details — name, tenant, revenue, city, image.
  2. Draw a polygon or rectangle directly on the map to define a region of interest.
  3. Ask a question in plain English. GeoGenie handles the rest.

Behind the scenes, when a user draws a shape, GeoGenie converts the polygon coordinates into WKT and appends that geometry to the prompt sent to Databricks Genie. Genie generates the SQL with the spatial filter already built in:

WITH ranked_sites AS (
  SELECT *,
    RANK() OVER (ORDER BY total_monthly_revenue DESC) AS revenue_rank
  FROM catalog.geogenie.site_locations
  WHERE ST_Intersects(
    ST_GeomFromWKT('POLYGON((-102.3 26.8, -95.2 26.8, -95.2 31.4, -102.3 31.4, -102.3 26.8))'),
    ST_Point(longitude, latitude)
  )
)
SELECT site_name, city, state, tenant_name, total_monthly_revenue
FROM ranked_sites
WHERE revenue_rank <= 5
ORDER BY total_monthly_revenue DESC;

The Genie Space is pre-configured with spatial-query instructions, so the generated SQL consistently uses ST_Intersects, constructs points with longitude first in ST_Point, and applies the drawn region as a spatial predicate automatically. The user focuses on the question. GeoGenie handles the spatial context and query generation.
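The conversion step can be sketched in a few lines of Python. This is a minimal illustration, not the code from the repository: `polygon_to_wkt` and `build_genie_prompt` are hypothetical names, and the prompt wording is an assumption. It takes the drawn vertices as (longitude, latitude) pairs, closes the ring as WKT requires, and appends the geometry to the user's question.

```python
from typing import List, Sequence, Tuple

# (longitude, latitude), the same order as ST_Point(longitude, latitude)
Coordinate = Tuple[float, float]


def polygon_to_wkt(vertices: Sequence[Coordinate]) -> str:
    """Serialize drawn vertices as a WKT POLYGON with a closed outer ring."""
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    ring: List[Coordinate] = list(vertices)
    # A WKT ring must end on the vertex it starts with.
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    body = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON(({body}))"


def build_genie_prompt(question: str, wkt: str) -> str:
    """Append the region to the question (hypothetical prompt wording)."""
    return (
        f"{question.strip()}\n\n"
        f"Restrict results to sites inside this region (WKT): {wkt}"
    )
```

For the rectangle used in the query above, `polygon_to_wkt([(-102.3, 26.8), (-95.2, 26.8), (-95.2, 31.4), (-102.3, 31.4)])` returns `POLYGON((-102.3 26.8, -95.2 26.8, -95.2 31.4, -102.3 31.4, -102.3 26.8))`, with the first vertex repeated to close the ring.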

How it works

GeoGenie combines four capabilities in the Databricks platform into a single experience:

| Layer | Technology |
| --- | --- |
| Frontend | CesiumJS 3D globe + Streamlit |
| AI / NL-to-SQL | Databricks Genie Space |
| Data governance | Unity Catalog + SQL Warehouse |
| App platform | Databricks Apps |

The Cesium globe runs inside an iframe rendered through Streamlit's custom component API and communicates with the Python backend via window.postMessage. This lets GeoGenie pair a rich JavaScript geospatial UI with simple Streamlit application logic — all deployed and authenticated through Databricks Apps with a service principal.
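On the Python side, whatever the globe posts back arrives as plain data that is worth validating before it reaches the prompt. The sketch below assumes a hypothetical message shape, `{"type": "region_drawn", "coordinates": [[lon, lat], ...]}`. The actual payload in the repository may differ, and `parse_region_message` is an illustrative name.

```python
from typing import Any, List, Optional, Tuple


def parse_region_message(message: Any) -> Optional[List[Tuple[float, float]]]:
    """Return (longitude, latitude) vertices from a drawn-region message.

    Returns None when the message is not a well-formed region
    (assumed schema, see the note above).
    """
    if not isinstance(message, dict) or message.get("type") != "region_drawn":
        return None
    coordinates = message.get("coordinates")
    if not isinstance(coordinates, (list, tuple)):
        return None

    vertices: List[Tuple[float, float]] = []
    for pair in coordinates:
        if not isinstance(pair, (list, tuple)) or len(pair) != 2:
            return None
        try:
            lon, lat = float(pair[0]), float(pair[1])
        except (TypeError, ValueError):
            return None
        # Reject values that cannot be a longitude/latitude pair.
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            return None
        vertices.append((lon, lat))

    # A region needs at least three vertices to enclose an area.
    return vertices if len(vertices) >= 3 else None
```

Returning `None` for anything unexpected keeps the application logic simple: the app can ignore the message and leave the previously drawn region in place.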

Setup is a single notebook. Clone the repo, run setup_and_deploy, and the notebook provisions everything: the Unity Catalog objects, synthetic sample data, a configured Genie Space with spatial instructions, and the deployed app with the right permissions. After setup, the app is ready to share with your team.

Three Wishes: What this unlocks

Pairing map interaction with Genie query generation matters for three reasons:

New personas get access to spatial analytics. A business analyst who cannot write ST_Intersects can now draw a region and ask a question. The barrier drops from "knows spatial SQL" to "can point at a map."

Governed by default. Every query runs through Unity Catalog. A business user gets spatial answers without being granted direct table access. The Genie Space controls which tables and columns are exposed. This is not a shortcut around governance — it is governance made usable.

A pattern, not just a demo. The architecture — visual interaction layer, natural language interface, governed data — is reusable. Telecom network planning, retail site intelligence, logistics coverage analysis, field asset management: any domain where location is a first-class dimension of the data can follow this pattern.

Try It

The repository is open source: https://github.com/databricks-solutions/genie-geo-chat

If you work with geospatial data on Databricks, use it as a starting point and adapt it to your own use case. Open an issue if you run into problems, and share what this inspires you to build. The combination of spatial SQL, natural language, and governed data access is still early — and there is a lot of room to push it further.

 

3 Comments
josh_melton
Databricks Employee

Super cool!!!

Ramana
Valued Contributor II

First off, I really enjoyed exploring this app. It's a great starting point, and I personally love the idea of visualizing geospatial data on a 3D pane. The concept and execution are genuinely impressive, and I can see a lot of potential value for customers wanting to bring location intelligence into Genie conversations.

I wanted to share a piece of feedback from the perspective of a Citizen Developer, since I think it's worth considering as the project evolves.

When I deployed the app to my personal workspace, everything worked beautifully — which makes sense, since I'm the only user and there are no other permissions at play. However, when I tried deploying it to an Enterprise workspace by simply following the README (without first reviewing the code in detail), I noticed that the deployment unintentionally overwrote all existing permissions on the SQL Warehouse. For shared enterprise environments where multiple teams and service principals already rely on the warehouse, this can be quite disruptive. 

The root cause appears to be in setup_and_deploy.py (around lines 426–433 in https://github.com/databricks-solutions/genie-geo-chat/blob/main/setup_and_deploy.py), where w.warehouses.set_permissions(...) is used. The set_permissions call replaces the entire ACL rather than appending to it, so any pre-existing grants (other users, SPs, groups) get wiped out. The same pattern may apply to a few other permission calls in that section as well. 

Would you mind taking another look at the security/permissions setup when you get a chance? Ideally, the deployment should add or update permissions for the required service principal without revoking anything that's already in place. A safer pattern would be to either:

  1. Use w.warehouses.update_permissions(...) (which patches the ACL incrementally), or
  2. Fetch the existing ACL first, append the app SP's entry, and pass the full merged list to set_permissions.
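
The merge logic behind the second option can be illustrated without touching a workspace. This sketch models ACL entries as plain dictionaries, not SDK objects, and `merge_acl` is a hypothetical helper, so treat it as an outline of the idea and not as the fix that landed in the repository.

```python
from typing import Dict, List, Tuple

AclEntry = Dict[str, str]

# Keys that identify who a grant belongs to (assumed field names).
PRINCIPAL_KEYS = ("user_name", "group_name", "service_principal_name")


def _principal(entry: AclEntry) -> Tuple[Tuple[str, str], ...]:
    """Identify the principal an ACL entry refers to."""
    return tuple((key, entry[key]) for key in PRINCIPAL_KEYS if key in entry)


def merge_acl(existing: List[AclEntry], new_entry: AclEntry) -> List[AclEntry]:
    """Keep every existing grant and add or replace only new_entry's principal."""
    target = _principal(new_entry)
    if not target:
        raise ValueError("new_entry must name a user, group, or service principal")
    # Copy entries so the caller's list is left untouched.
    merged = [dict(entry) for entry in existing if _principal(entry) != target]
    merged.append(dict(new_entry))
    return merged
```

Given existing grants for a group and a user, `merge_acl(existing, {"service_principal_name": "app-sp", "permission_level": "CAN_USE"})` returns all three entries. That full list is what a replace-style call such as set_permissions would need to receive so that nothing already in place is dropped.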

A similar review of the w.permissions.update(...) call for the Genie Space would also be worthwhile, just to be safe.

Totally understand this is a solution accelerator and not a hardened product — just flagging it because Citizen Developers (myself included) tend to follow the README verbatim, and a small change here would make the app much safer to drop into enterprise workspaces. 

Thanks again for putting this together — really looking forward to seeing where it goes.

MohanaBasak
Databricks Employee

Hi @Ramana, thanks for letting me know. I have updated the notebook to use update_permissions(). I also tested the Genie permissions, and since it already uses update(), I don't see it overwriting existing ones. Thanks again for using this and bringing this to my attention.