4 weeks ago
For one of my clients, we are building the entire semantic layer for data/BI reporting, and the client uses Snowflake. Is there a way to quickly integrate AI/BI Genie on top of Snowflake?
4 weeks ago
Hey @TJ-Leap-Forward ,
Yes — you can stand up Databricks AI/BI Genie on top of Snowflake quickly by federating Snowflake into Unity Catalog and then building Genie spaces over those governed datasets, without migrating data out of Snowflake. This works because Genie operates on Unity Catalog–registered data, including foreign (federated) tables and views.
Genie spaces use Unity Catalog metadata and author-provided instructions to translate natural language into SQL over your governed datasets, including foreign tables and metric views.
Lakehouse Federation natively supports Snowflake, so you can register Snowflake databases in Unity Catalog and query them from Databricks (read-only), with pushdowns for common operations to Snowflake for performance.
You can choose between two federation modes, query federation or catalog federation for Iceberg tables (see "Choosing a federation mode" below).
Governance is unified: Unity Catalog handles permissions, lineage, and auditing on federated Snowflake tables, and Genie respects those permissions when answering questions.
Prereqs
Create a Snowflake security integration (for OAuth) in Snowflake:
CREATE SECURITY INTEGRATION aibi_uc_oauth
  TYPE = oauth
  ENABLED = TRUE
  OAUTH_CLIENT = custom
  OAUTH_CLIENT_TYPE = 'CONFIDENTIAL'
  OAUTH_REDIRECT_URI = 'https://<your-workspace-url>/login/oauth/snowflake.html'
  OAUTH_ISSUE_REFRESH_TOKENS = TRUE
  OAUTH_REFRESH_TOKEN_VALIDITY = 7776000 -- 90 days
  OAUTH_ENFORCE_PKCE = TRUE;
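To retrieve the client_id and client_secret referenced in the connection below, query the integration you just created in Snowflake (Snowflake stores the integration name in uppercase):
SELECT SYSTEM$SHOW_OAUTH_CLIENT_SECRETS('AIBI_UC_OAUTH');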
Create a Unity Catalog connection to Snowflake (UI or SQL):
CREATE CONNECTION sf_conn
TYPE SNOWFLAKE
OPTIONS (
  host 'your-acct.region.cloud.snowflakecomputing.com',
  port '443',
  auth_type 'oauth',
  client_id '<from SYSTEM$SHOW_OAUTH_CLIENT_SECRETS>',
  client_secret '<from SYSTEM$SHOW_OAUTH_CLIENT_SECRETS>',
  oauth_scope 'refresh_token session:role:<ROLE>',
  user 'SNOWFLAKE_USER'
);
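To confirm the connection was registered in Unity Catalog, you can list connections from Databricks SQL:
SHOW CONNECTIONS;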
Create a foreign catalog that mirrors your Snowflake database:
CREATE FOREIGN CATALOG snowflake_sales
USING CONNECTION sf_conn
OPTIONS (database 'SALES_DB');
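As a quick sanity check before granting anything, list what the new foreign catalog exposes:
SHOW SCHEMAS IN snowflake_sales;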
Grant access in Unity Catalog at the catalog, schema, or table level as needed; a sketch follows.
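A minimal example of the grants, assuming a hypothetical `analysts` group (adjust the principal and scope to your setup):
GRANT USE CATALOG ON CATALOG snowflake_sales TO `analysts`;
GRANT USE SCHEMA ON SCHEMA snowflake_sales.public TO `analysts`;
GRANT SELECT ON SCHEMA snowflake_sales.public TO `analysts`;
Then test a query from Databricks SQL: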
SELECT * FROM snowflake_sales.public.orders LIMIT 10;
Build your Genie space
Once the foreign catalog is queryable, create a Genie space over the relevant schemas and tables, add table and column descriptions plus instructions, and iterate with sample questions so Genie generates accurate SQL.
Choosing a federation mode
Query federation is ideal for fast proof-of-concept access: Databricks pushes filters, joins, and aggregations down to Snowflake over JDBC to keep latency low.
Catalog federation (Iceberg) lets Databricks read Iceberg tables in your Snowflake catalog directly from cloud object storage, using Databricks compute for better cost/performance. Configure an external location for the Iceberg paths; Databricks auto-detects Iceberg tables and falls back to query federation if a table isn't eligible.
For frequently queried Snowflake sources, consider materialized views in Databricks to stabilize concurrency and accelerate cross-source joins while retaining UC governance (a sketch follows).
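A minimal sketch of that pattern; the target schema main.reporting and the column names are hypothetical, and the materialized view must live in a regular (writable) UC catalog rather than the read-only foreign catalog:
-- Hypothetical daily rollup over the federated Snowflake table
CREATE MATERIALIZED VIEW main.reporting.orders_daily AS
SELECT order_date, region, SUM(amount) AS total_amount
FROM snowflake_sales.public.orders
GROUP BY order_date, region;
-- Optionally add a SCHEDULE clause so Databricks refreshes it on a cadence.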
Read-only: Federated Snowflake tables are read-only from Databricks; writes still happen in Snowflake, or you land data into managed UC tables via pipelines.
Case sensitivity: Preserve case for Snowflake identifiers by quoting the database/schema/table names where needed.
Limits and sizing: Large result sets can pressure single-stream JDBC reads; design queries/filters accordingly and leverage metric views or materialized views for heavy workloads.
Governance: Unity Catalog enforces permissions and lineage across federated data; Genie uses UC metadata and your semantic annotations to improve answer quality.
Hope this helps, Louis.
yesterday
Thank you, Louis, for providing all the details — they were extremely helpful. To give you some context, we are an IT consulting and software services company, and we recently became a registered partner. This area is still fairly new to me.
We are currently exploring a Proof of Concept for one of our customers who is building a comprehensive semantic layer with live reporting in Power BI. We would like to demonstrate how AI/BI Genie can add significant value and enhance their capabilities. Could you please guide me on the best approach to get started? Are there any associated costs that we should factor in? Additionally, is there someone else we should connect with for further assistance?