<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extract Snowflake data based on environment in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/142015#M51863</link>
    <description>&lt;P&gt;Create a single job that runs your migration notebook.&lt;BR /&gt;In the job settings, under Parameters, add a key like env with a default value (e.g., dev).&lt;BR /&gt;When you create different job runs (or schedule them), override the parameter:&lt;/P&gt;&lt;P&gt;For development runs, set env = dev.&lt;BR /&gt;For production runs, set env = prod.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;env = dbutils.widgets.get("env")
source_db = "VP_DWH" if env == "prod" else "VD_DWH"
source_table_fqn = f"{source_db}.SALES.SALES_DETAIL"&lt;/LI-CODE&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;P&gt;Yes, DABs are a great option if:&lt;/P&gt;&lt;P&gt;You want environment-specific configurations (dev, staging, prod) in a single deployment artifact.&lt;BR /&gt;You need CI/CD integration for Databricks Jobs.&lt;/P&gt;&lt;P&gt;With DABs:&lt;/P&gt;&lt;P&gt;Define job parameters per environment in bundle.yml:&lt;/P&gt;&lt;/DIV&gt;&lt;LI-CODE lang="python"&gt;environments:
  dev:
    jobs:
      migrate-job:
        parameters:
          env: "dev"
  prod:
    jobs:
      migrate-job:
        parameters:
           env: "prod"&lt;/LI-CODE&gt;</description>
    <pubDate>Tue, 16 Dec 2025 18:01:37 GMT</pubDate>
    <dc:creator>nayan_wylde</dc:creator>
    <dc:date>2025-12-16T18:01:37Z</dc:date>
    <item>
      <title>Extract Snowflake data based on environment</title>
      <link>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141889#M51843</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In the development workspace, I need to extract data from a table/view in the Snowflake development environment. An example table is VD_DWH.SALES.SALES_DETAIL.&lt;/P&gt;&lt;P&gt;When we deploy the code to production, it needs to extract data from the corresponding table/view in the Snowflake production environment, e.g. VP_DWH.SALES.SALES_DETAIL.&lt;/P&gt;&lt;P&gt;In brief: for dev the database name is VD_DWH; for production the database name is VP_DWH.&lt;/P&gt;&lt;P&gt;How can I implement this in Databricks?&lt;/P&gt;</description>
      <pubDate>Mon, 15 Dec 2025 17:41:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141889#M51843</guid>
      <dc:creator>ndw</dc:creator>
      <dc:date>2025-12-15T17:41:17Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Snowflake data based on environment</title>
      <link>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141890#M51844</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/198274"&gt;@ndw&lt;/a&gt;&amp;nbsp;Can you please share more details on the request? Are you trying to extract data from Snowflake into Databricks?&lt;/P&gt;</description>
      <pubDate>Mon, 15 Dec 2025 18:02:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141890#M51844</guid>
      <dc:creator>nayan_wylde</dc:creator>
      <dc:date>2025-12-15T18:02:31Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Snowflake data based on environment</title>
      <link>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141895#M51846</link>
      <description>&lt;P&gt;We are migrating from our legacy Snowflake data warehouse to the Databricks lakehouse. Until the migration is complete, I need to transfer Snowflake data into Databricks, but the dev and prod Snowflake databases have different names.&lt;/P&gt;&lt;P&gt;In the development environment, the migration source is VD_DWH.SALES.SALES_DETAIL; in the production environment, the source is VP_DWH.SALES.SALES_DETAIL.&lt;/P&gt;&lt;P&gt;Depending on the environment, I need to swap the source table in the SELECT statement.&lt;/P&gt;</description>
      <pubDate>Mon, 15 Dec 2025 19:19:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141895#M51846</guid>
      <dc:creator>ndw</dc:creator>
      <dc:date>2025-12-15T19:19:31Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Snowflake data based on environment</title>
      <link>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141900#M51847</link>
      <description>&lt;P&gt;Option 1 — Notebook widgets (simple &amp;amp; reliable)&lt;/P&gt;&lt;P&gt;Use a Databricks widget to pass the environment (e.g., via a Jobs/Workflows parameter), then build the table name dynamically.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Databricks notebook
dbutils.widgets.dropdown("env", "dev", ["dev", "prod"], "Environment")
env = dbutils.widgets.get("env")

# Map env → Snowflake database
db_map = {"dev": "VD_DWH", "prod": "VP_DWH"}
source_db = db_map.get(env, "VD_DWH")  # default to dev
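# Optional hardening (a sketch, not from the original post): fail fast on
# an unexpected value instead of silently falling back to dev.
# if env not in db_map:
#     raise ValueError(f"Unexpected env: {env!r}")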

source_table_fqn = f"{source_db}.SALES.SALES_DETAIL"

# Snowflake connection options (store secrets in Databricks Secrets)
sfOptions = {
    "sfUrl": dbutils.secrets.get("scope", "sfUrl"),
    "sfUser": dbutils.secrets.get("scope", "sfUser"),
    "sfPassword": dbutils.secrets.get("scope", "sfPassword"),
    "sfWarehouse": dbutils.secrets.get("scope", "sfWarehouse"),
    "sfRole": dbutils.secrets.get("scope", "sfRole"),
}
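# Note: "scope" and the key names above are placeholders; substitute the
# secret scope and keys actually configured in your workspace.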

# Read from Snowflake table into Spark DataFrame
df = (spark.read
      .format("snowflake")
      .options(**sfOptions)
      .option("dbtable", source_table_fqn)
      .load())

# Do your transforms, then write to Delta/UC
df.write.mode("overwrite").saveAsTable("lakehouse.sales_detail")
&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Option 2 — Jobs/Workflows parameters (no widgets in code)&lt;/P&gt;&lt;P&gt;If you’re scheduling via Databricks Jobs (or an orchestrator like ADF/Airflow), pass env as a parameter and read it with dbutils.widgets.get. You can also pass the database name directly as a parameter to avoid the mapping (see the commented variant in the Option 3 snippet below).&lt;/P&gt;&lt;P&gt;Option 3 — Configuration-driven (robust for many tables)&lt;/P&gt;&lt;P&gt;Centralize environment settings (e.g., in a UC Volume/Delta table or notebook-scoped JSON) and look them up.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import json

config_json = """
{
  "dev": {"source_db": "VD_DWH"},
  "prod": {"source_db": "VP_DWH"}
}
"""
cfg = json.loads(config_json)
env = dbutils.widgets.get("env")  # or default 'dev'
source_db = cfg[env]["source_db"]

source_table_fqn = f"{source_db}.SALES.SALES_DETAIL"
# ... proceed to read from Snowflake&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 15 Dec 2025 19:56:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141900#M51847</guid>
      <dc:creator>nayan_wylde</dc:creator>
      <dc:date>2025-12-15T19:56:29Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Snowflake data based on environment</title>
      <link>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141901#M51848</link>
      <description>&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;I will schedule via a Databricks Job. I need to pass env as dev in development and as prod in production. How can I pass different values to the same job in different environments? Should I use DABs to pass env?&lt;/P&gt;</description>
      <pubDate>Mon, 15 Dec 2025 20:23:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/141901#M51848</guid>
      <dc:creator>ndw</dc:creator>
      <dc:date>2025-12-15T20:23:29Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Snowflake data based on environment</title>
      <link>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/142015#M51863</link>
      <description>&lt;P&gt;Create a single job that runs your migration notebook.&lt;BR /&gt;In the job settings, under Parameters, add a key like env with a default value (e.g., dev).&lt;BR /&gt;When you create different job runs (or schedule them), override the parameter:&lt;/P&gt;&lt;P&gt;For development runs, set env = dev.&lt;BR /&gt;For production runs, set env = prod.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;env = dbutils.widgets.get("env")
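# "env" must match the parameter key defined in the job settings; a run
# that doesn't override it receives the job-level default (e.g., dev).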
source_db = "VP_DWH" if env == "prod" else "VD_DWH"
source_table_fqn = f"{source_db}.SALES.SALES_DETAIL"&lt;/LI-CODE&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;P&gt;Yes, DABs are a great option if:&lt;/P&gt;&lt;P&gt;You want environment-specific configurations (dev, staging, prod) in a single deployment artifact.&lt;BR /&gt;You need CI/CD integration for Databricks Jobs.&lt;/P&gt;&lt;P&gt;With DABs:&lt;/P&gt;&lt;P&gt;Define job parameters per environment in bundle.yml:&lt;/P&gt;&lt;/DIV&gt;&lt;LI-CODE lang="python"&gt;environments:
  dev:
    jobs:
      migrate-job:
        parameters:
          env: "dev"
  prod:
    jobs:
      migrate-job:
        parameters:
           env: "prod"&lt;/LI-CODE&gt;</description>
      <pubDate>Tue, 16 Dec 2025 18:01:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extract-snowflake-data-based-on-environment/m-p/142015#M51863</guid>
      <dc:creator>nayan_wylde</dc:creator>
      <dc:date>2025-12-16T18:01:37Z</dc:date>
    </item>
  </channel>
</rss>

