Extract Snowflake data based on environment

ndw
New Contributor III

Hi all,

 

In the development workspace, I need to extract data from a table/view in Snowflake development environment. Example table is called as VD_DWH.SALES.SALES_DETAIL

When we deploy the code into production, it needs to extract data from a table/view in Snowflake production environment. Example table is called as VP_DWH.SALES.SALES_DETAIL

On brief, for dev database name is VD_DWH

for production database name is VP_DWH

How to implement this in Databricks?

 

nayan_wylde
Esteemed Contributor II

@ndw Can you please share more details on the request. Are you trying to extract data from snowflake in databricks?

ndw
New Contributor III

There is a migration from the legacy data warehouse Snowflake to the lakehouse in the Databricks. Until the migration has completed, I need to transfer Snowflake data into Databricks. but dev and prod snowflake databases have different names.

In the development environment, data migration source is VD_DWH.SALES.SALES_DETAIL; in the production environment the source is VP_DWH.SALES.SALES_DETAIL

 

According to the environment I need to replace source table in the select statement 

nayan_wylde
Esteemed Contributor II

Option 1 — Notebook widgets (simple & reliable)

Use a Databricks widget to pass the environment (e.g., via Jobs/Workflows parameter), then build the table name dynamically.

# Databricks notebook
dbutils.widgets.dropdown("env", "dev", ["dev", "prod"], "Environment")
env = dbutils.widgets.get("env")

# Map env → Snowflake database
db_map = {"dev": "VD_DWH", "prod": "VP_DWH"}
source_db = db_map.get(env, "VD_DWH")  # default to dev

source_table_fqn = f"{source_db}.SALES.SALES_DETAIL"

# Snowflake connection options (store secrets in Databricks Secrets)
sfOptions = {
    "sfUrl": dbutils.secrets.get("scope", "sfUrl"),
    "sfUser": dbutils.secrets.get("scope", "sfUser"),
    "sfPassword": dbutils.secrets.get("scope", "sfPassword"),
    "sfWarehouse": dbutils.secrets.get("scope", "sfWarehouse"),
    "sfRole": dbutils.secrets.get("scope", "sfRole"),
}

# Read from Snowflake table into Spark DataFrame
df = (spark.read
      .format("snowflake")
      .options(**sfOptions)
      .option("dbtable", source_table_fqn)
      .load())

# Do your transforms, then write to Delta/UC
df.write.mode("overwrite").saveAsTable("lakehouse.sales_detail")
``

 

Option 2 — Jobs/Workflows parameters (no widgets in code)

If you’re scheduling via Databricks Jobs (or your orchestrator like ADF/Airflow), pass env as a parameter and read it with dbutils.widgets.get. You can also pass the database name directly as a parameter to avoid mapping.

Option 3 — Configuration-driven (robust for many tables)

Centralize environment settings (e.g., in a UC Volume/Delta table or notebook-scoped JSON) and look them up

 

import json

config_json = """
{
  "dev": {"source_db": "VD_DWH"},
  "prod": {"source_db": "VP_DWH"}
}
"""
cfg = json.loads(config_json)
env = dbutils.widgets.get("env")  # or default 'dev'
source_db = cfg[env]["source_db"]

source_table_fqn = f"{source_db}.SALES.SALES_DETAIL"
# ... proceed to read from Snowflake

View solution in original post

ndw
New Contributor III

Thank you

will schedule via Databricks Job. I need to pass env as dev in the development and prod in the production. How can I pass different values with the same job in different environments? Should I use DABs for passing env?

nayan_wylde
Esteemed Contributor II

Create a single job that runs your migration notebook.
In the job settings, under Parameters, add a key like env with a default value (e.g., dev).
When you create different job runs (or schedule them), override the parameter:

For development runs, set env = dev.
For production runs, set env = prod.

env = dbutils.widgets.get("env")
source_db = "VP_DWH" if env == "prod" else "VD_DWH"
source_table_fqn = f"{source_db}.SALES.SALES_DETAIL"
 

Yes, DABs are a great option if:

You want environment-specific configurations (dev, staging, prod) in a single deployment artifact.
You need CI/CD integration for Databricks Jobs.

With DABs:

Define job parameters per environment in bundle.yml:

environments:
  dev:
    jobs:
      migrate-job:
        parameters:
          env: "dev"
  prod:
    jobs:
      migrate-job      migrate-job:
        parameters:
           env: "prod"