
AI/BI Genie for Snowflake

TJ-Leap-Forward
New Contributor

For one of my clients, who uses Snowflake, we are building the entire semantic layer for data (BI) reporting. Is there a way to quickly integrate AI/BI Genie on top of Snowflake?

1 REPLY

Louis_Frolio
Databricks Employee

Hey @TJ-Leap-Forward,

Yes, you can stand up Databricks AI/BI Genie on top of Snowflake quickly by federating Snowflake into Unity Catalog and then building Genie spaces over those governed datasets, without migrating data out of Snowflake. This works because Genie operates on Unity Catalog-registered data, including foreign (federated) tables and views.

What works and why

  • Genie spaces use Unity Catalog metadata and author-provided instructions to translate natural language into SQL over your governed datasets, including foreign tables and metric views (see the sketch just after this list).

  • Lakehouse Federation natively supports Snowflake, so you can register Snowflake databases in Unity Catalog and query them from Databricks (read-only), with pushdowns for common operations to Snowflake for performance.

  • You can choose between two federation modes:

    • Query federation (JDBC pushdown to Snowflake compute) for fast, on-demand access.
    • Catalog federation (direct reads of Iceberg tables in object storage using Databricks compute) for better cost/performance when your Snowflake data is stored as Iceberg.
      Both options are supported for Snowflake connections in Unity Catalog.
  • Governance is unified: Unity Catalog handles permissions, lineage, and auditing on federated Snowflake tables, and Genie respects those permissions when answering questions.
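
To make the metadata point concrete: one common pattern is to wrap federated Snowflake tables in governed views whose names and comments carry your semantic layer, so Genie has richer context when it generates SQL. A minimal sketch, assuming the foreign catalog snowflake_sales created in the walkthrough below and hypothetical catalog, schema, and column names:

  -- Hypothetical semantic-layer view over a federated Snowflake table;
  -- Genie reads the view and column comments when generating SQL.
  CREATE VIEW main.semantic.orders
    (order_id   COMMENT 'Unique order identifier',
     order_date COMMENT 'Date the order was placed',
     revenue    COMMENT 'Order amount in USD')
  COMMENT 'Curated orders view for BI reporting, sourced from Snowflake'
  AS SELECT o_orderkey, o_orderdate, o_totalprice
     FROM snowflake_sales.public.orders;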

     

Fastest path: 30-60 minute setup

  1. Prereqs

    • A Unity Catalog-enabled workspace; network connectivity from Databricks to Snowflake; a Pro or Serverless SQL warehouse (2023.40+) or a cluster on Databricks Runtime 13.3 LTS or above.
  2. Create a Snowflake security integration (for OAuth) in Snowflake:

    CREATE SECURITY INTEGRATION aibi_uc_oauth
      TYPE = oauth
      ENABLED = TRUE
      OAUTH_CLIENT = custom
      OAUTH_CLIENT_TYPE = 'CONFIDENTIAL'
      OAUTH_REDIRECT_URI = 'https://<your-workspace-url>/login/oauth/snowflake.html'
      OAUTH_ISSUE_REFRESH_TOKENS = TRUE
      OAUTH_REFRESH_TOKEN_VALIDITY = 7776000  -- 90 days
      OAUTH_ENFORCE_PKCE = TRUE;
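
    To retrieve the client_id and client_secret referenced in step 3, run Snowflake's built-in helper against the integration you just created:

    SELECT SYSTEM$SHOW_OAUTH_CLIENT_SECRETS('AIBI_UC_OAUTH');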
  3. Create a Unity Catalog connection to Snowflake (UI or SQL):

    CREATE CONNECTION sf_conn
    TYPE SNOWFLAKE
    OPTIONS (
      host 'your-acct.region.cloud.snowflakecomputing.com',
      port '443',
      auth_type 'oauth',
      client_id '<from SYSTEM$SHOW_OAUTH_CLIENT_SECRETS>',
      client_secret '<from SYSTEM$SHOW_OAUTH_CLIENT_SECRETS>',
      oauth_scope 'refresh_token session:role:<ROLE>',
      user 'SNOWFLAKE_USER'
    );
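
    Optionally confirm the connection registered correctly before proceeding:

    DESCRIBE CONNECTION sf_conn;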
  4. Create a foreign catalog that mirrors your Snowflake database:

    CREATE FOREIGN CATALOG snowflake_sales
    USING CONNECTION sf_conn
    OPTIONS (database 'SALES_DB');
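
    A quick way to sanity-check the mirror is to list what Unity Catalog now sees:

    SHOW SCHEMAS IN snowflake_sales;
    SHOW TABLES IN snowflake_sales.public;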
  5. Grant access in Unity Catalog (catalog/schema/table as needed) and test a query from Databricks SQL:

    SELECT * FROM snowflake_sales.public.orders LIMIT 10;
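
    For the grant portion of this step, a minimal sketch (bi_analysts is a placeholder group for your BI consumers):

    GRANT USE CATALOG ON CATALOG snowflake_sales TO `bi_analysts`;
    GRANT USE SCHEMA ON SCHEMA snowflake_sales.public TO `bi_analysts`;
    GRANT SELECT ON SCHEMA snowflake_sales.public TO `bi_analysts`;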
     
  6. Build your Genie space

    • Select the federated Snowflake tables/views (and any metric views) as datasets.
    • Add example queries, parameters, synonyms, and space instructions to align with your semantic layer.
    • Use "Trusted assets" (parameterized example queries and SQL functions) so vetted answers are flagged as trusted (a sketch follows this list).
    • Publish to business users (Consumer Access) and monitor with the admin tools.
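
Trusted assets can be parameterized example queries or Unity Catalog SQL functions that the Genie space calls for vetted answers. A minimal sketch of the function flavor, with hypothetical catalog, schema, and column names (adapt to your semantic model):

  -- Hypothetical function registered as a trusted asset in the Genie space;
  -- answers produced through it are flagged as trusted for business users.
  CREATE OR REPLACE FUNCTION main.semantic.revenue_by_region(region STRING)
    RETURNS TABLE (order_month DATE, total_revenue DOUBLE)
    COMMENT 'Monthly revenue for a given sales region, vetted by the BI team'
    RETURN
      SELECT CAST(date_trunc('month', o_orderdate) AS DATE) AS order_month,
             CAST(SUM(o_totalprice) AS DOUBLE) AS total_revenue
      FROM snowflake_sales.public.orders
      WHERE o_region = revenue_by_region.region
      GROUP BY 1;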

Performance and architecture options

  • Query federation is ideal for fast proof-of-concept access; Databricks pushes filters/joins/aggregations down to Snowflake via JDBC to keep latency low.

  • Catalog federation (Iceberg) lets Databricks read Iceberg tables in your Snowflake catalog directly from cloud object storage, using Databricks compute for better cost/perf. Configure an external location for Iceberg paths; Databricks auto-detects Iceberg tables and falls back to query federation if a table isn't eligible.

  • For frequently queried Snowflake sources, consider materialized views in Databricks to stabilize concurrency and accelerate cross-source joins, while retaining UC governance.
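
If you go the materialized-view route, a minimal sketch (names are placeholders; the materialized view is computed on Databricks and can be refreshed on demand or on a schedule):

  CREATE MATERIALIZED VIEW main.semantic.daily_orders AS
  SELECT o_orderdate, o_region, SUM(o_totalprice) AS revenue
  FROM snowflake_sales.public.orders
  GROUP BY o_orderdate, o_region;

  -- Refresh on demand, or wire a schedule through your orchestration of choice
  REFRESH MATERIALIZED VIEW main.semantic.daily_orders;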

Important considerations

  • Read-only: Federated Snowflake tables are read-only from Databricks; writes still happen in Snowflake, or land in managed Unity Catalog tables via separate pipelines.

  • Case sensitivity: Preserve case for Snowflake identifiers by quoting the database/schema/table names where needed.

  • Limits and sizing: Large result sets can pressure single-stream JDBC reads; design queries/filters accordingly and leverage metric views or materialized views for heavy workloads.

  • Governance: Unity Catalog enforces permissions and lineage across federated data; Genie uses UC metadata and your semantic annotations to improve answer quality.

 

Hope this helps, Louis.

 
