08-12-2024 03:45 AM
Hi all,
How can I move my Delta tables from a Dev workspace to a Prod workspace?
Is there any dynamic Python code to do this?
Labels: Spark
Accepted Solutions
12-09-2024 12:11 AM
To move Delta tables from a Dev workspace to a Prod workspace, you can use a combination of Delta Lake features and Databricks APIs. Here's a high-level approach with some Python code to help you get started:
Method 1: Using the CLONE command
For smaller tables, or when both workspaces can access the same storage:

from delta.tables import DeltaTable

# Connect to Dev workspace
dev_spark = ...   # Initialize SparkSession for the Dev workspace
# Connect to Prod workspace
prod_spark = ...  # Initialize SparkSession for the Prod workspace

def clone_table(table_name):
    # Deep-clone the table from the Dev database into the Prod database
    dev_table = DeltaTable.forName(dev_spark, f"dev_database.{table_name}")
    dev_table.clone(
        target=f"prod_database.{table_name}",
        isShallow=False,
        replace=True
    )

# Get the list of tables to clone
tables_to_clone = dev_spark.sql("SHOW TABLES IN dev_database").select("tableName").collect()

# Clone each table
for row in tables_to_clone:
    clone_table(row.tableName)
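Note that the DeltaTable.clone Python API shown above may not be available in every Delta Lake / Databricks Runtime version. A minimal alternative sketch does the same deep clone with the SQL CLONE command (assuming the same dev_database and prod_database names and that the target database is reachable from the session running the statement):

def clone_table_sql(spark, table_name):
    # DEEP CLONE copies the data files; SHALLOW CLONE would copy only the metadata
    spark.sql(f"""
        CREATE OR REPLACE TABLE prod_database.{table_name}
        DEEP CLONE dev_database.{table_name}
    """)

for row in dev_spark.sql("SHOW TABLES IN dev_database").select("tableName").collect():
    clone_table_sql(dev_spark, row.tableName)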
Method 2: Using external storage
For larger tables or when workspaces can't directly access each other's storage:
- Export data to external storage:
def export_table(table_name, external_path):
    df = dev_spark.table(f"dev_database.{table_name}")
    df.write.format("delta").mode("overwrite").save(f"{external_path}/{table_name}")

# Export all tables
external_path = "abfss://your-container@your-storage-account.dfs.core.windows.net/export"
tables_to_export = dev_spark.sql("SHOW TABLES IN dev_database").select("tableName").collect()
for row in tables_to_export:
    export_table(row.tableName, external_path)
- Import data in Prod workspace:
def import_table(table_name, external_path):
    df = prod_spark.read.format("delta").load(f"{external_path}/{table_name}")
    df.write.format("delta").mode("overwrite").saveAsTable(f"prod_database.{table_name}")

# Import all tables. The tables_to_export list from the Dev session isn't available
# in the Prod workspace, so derive the table names from the exported folders.
exported_tables = [f.name.rstrip("/") for f in dbutils.fs.ls(external_path)]
for table_name in exported_tables:
    import_table(table_name, external_path)
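After the import, a quick sanity check is to compare row counts between the Dev source and the Prod target. A minimal sketch, assuming the dev_spark and prod_spark sessions and the table list from the snippets above:

def verify_table(table_name):
    dev_count = dev_spark.table(f"dev_database.{table_name}").count()
    prod_count = prod_spark.table(f"prod_database.{table_name}").count()
    status = "OK" if dev_count == prod_count else "MISMATCH"
    print(f"{table_name}: dev={dev_count}, prod={prod_count} -> {status}")

for row in tables_to_export:
    verify_table(row.tableName)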

