4 weeks ago
Hi @Nkrom ,
I am happy to share the Azure REST API method! The Azure Python SDK is the quickest way to call it from a notebook, but any other language with an Azure Storage SDK will work too.
ADLS Gen2 uses a "Hierarchical Namespace" (HNS). On an HNS-enabled storage account, renaming a folder through the Azure SDK doesn't touch the files inside. It just updates the folder name in the metadata layer. dbutils.fs.mv, by contrast, does a copy followed by a delete, so its runtime grows with the amount of data in the folder. What takes dbutils 7 hours should take this API a couple of seconds.
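To make the difference concrete, here is a toy model in plain Python. It is not Azure code and says nothing about how Azure Storage is built internally; the function names and data structures are made up for illustration. A flat store keyed by full path has to rewrite every key under the folder, while a hierarchical store only relinks one entry in the parent.

```python
# Toy illustration only: these structures are hypothetical and do not
# reflect how Azure Storage is implemented internally.

def rename_flat(store: dict, old_prefix: str, new_prefix: str) -> int:
    """Rename a 'folder' in a flat key/value store; returns keys rewritten."""
    old = old_prefix.rstrip("/") + "/"
    new = new_prefix.rstrip("/") + "/"
    touched = 0
    for key in list(store):  # iterate over a copy so we can mutate the dict
        if key.startswith(old):
            store[new + key[len(old):]] = store.pop(key)
            touched += 1
    return touched


def rename_hierarchical(root: dict, old_name: str, new_name: str) -> int:
    """Rename a top-level directory in a nested-dict tree; returns entries rewritten."""
    root[new_name] = root.pop(old_name)  # children are never visited
    return 1


flat = {f"customer/part-{i}.parquet": b"" for i in range(1000)}
tree = {"customer": {f"part-{i}.parquet": b"" for i in range(1000)}}

flat_touched = rename_flat(flat, "customer", "customer_temp")
tree_touched = rename_hierarchical(tree, "customer", "customer_temp")
print(flat_touched, tree_touched)  # prints: 1000 1
```

The flat rename does work proportional to the number of files, and the hierarchical one does a single update no matter how many files sit underneath.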
Here is how you do it in a Databricks Notebook:
First, you need to install the Azure Data Lake library. You can run this in the first cell of your notebook:
%pip install azure-storage-file-datalake
Next, use this script. Important: Never hardcode your storage key in the notebook. Always use dbutils.secrets.get() to pull it from a Databricks secret scope (for example, one backed by Azure Key Vault). Pasting the key in directly also works, but only do that for a quick throwaway test.
from azure.storage.filedatalake import DataLakeServiceClient
# 1. Setup your credentials securely
storage_account = "<your_storage_account_name>"
container = "<your_container_name>"
# Pull the storage key from Databricks Secrets
storage_key = dbutils.secrets.get(scope="your_scope_name", key="your_secret_name")
# Create the client connection
service_client = DataLakeServiceClient(
    account_url=f"https://{storage_account}.dfs.core.windows.net",
    credential=storage_key,
)
file_system_client = service_client.get_file_system_client(file_system=container)
# 2. Get clients for your current directories
dir_customer = file_system_client.get_directory_client("customer")
dir_customer_01 = file_system_client.get_directory_client("customer_01")
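# 3. Perform the swap. Each rename is an atomic metadata operation and returns almost instantly,
# but the three renames together are not one atomic step, so "customer" is briefly missing in between.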
# Note: The new_name parameter requires the container name in the path
dir_customer.rename_directory(new_name=f"{container}/customer_temp")
dir_customer_01.rename_directory(new_name=f"{container}/customer")
# Get the temp folder and rename it to customer_01
dir_temp = file_system_client.get_directory_client("customer_temp")
dir_temp.rename_directory(new_name=f"{container}/customer_01")
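print("Folders swapped instantly via Azure API!")
Quick note: If your company uses Microsoft Entra ID identities (a service principal or a managed identity) instead of storage account keys, you can install the azure-identity library, add from azure.identity import DefaultAzureCredential, and swap credential=storage_key out for credential=DefaultAzureCredential(). The identity also needs data access on the container, for example the Storage Blob Data Contributor role.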
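Because the swap in the script is three separate renames, a failure in the middle can leave the folders half-swapped. Here is a minimal sketch of a wrapper that undoes the completed steps, assuming the same file_system_client and container as in the script. swap_directories and its temp-name default are hypothetical names of mine, not part of the Azure SDK; the only SDK calls it relies on are get_directory_client(), exists(), and rename_directory().

```python
def swap_directories(file_system_client, container, dir_a, dir_b, temp=None):
    """Swap two directory names via three renames, rolling back on failure.

    Hypothetical helper: file_system_client is expected to behave like
    azure.storage.filedatalake.FileSystemClient.
    """
    temp = temp or f"{dir_a}_temp"
    # Refuse to run if a leftover temp directory is in the way
    if file_system_client.get_directory_client(temp).exists():
        raise RuntimeError(f"Temp directory '{temp}' already exists; aborting.")

    # (source, destination) pairs, applied in order
    steps = [(dir_a, temp), (dir_b, dir_a), (temp, dir_b)]
    done = []
    try:
        for src, dst in steps:
            client = file_system_client.get_directory_client(src)
            # new_name must be prefixed with the container (file system) name
            client.rename_directory(new_name=f"{container}/{dst}")
            done.append((src, dst))
    except Exception:
        # Best-effort rollback: undo completed renames in reverse order
        for src, dst in reversed(done):
            file_system_client.get_directory_client(dst).rename_directory(
                new_name=f"{container}/{src}"
            )
        raise
```

You would call it as swap_directories(file_system_client, container, "customer", "customer_01"). The rollback is best effort only: if the storage account is unreachable, the undo renames will fail as well, so check the folder names after any error.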
Give this a try in your workflow; it should save you a massive amount of cluster compute time. Let us know how it goes!