Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.

Tag dynamic allpurpose

pargit
New Contributor

hi..

I want to use one all-purpose cluster and apply dynamic tagging in each notebook.

For example, tags for project and department.

I want to be able to change the tags from the notebook so I can understand the costs for each project and department.

I tried `spark.conf.set("spark.databricks.clusterUsageTags.dep", sc)`,

but the tag didn't change when I ran my pipeline.

thanks

 

1 REPLY

Louis_Frolio
Databricks Employee

Greetings @pargit , 

Why Your Approach Isn't Working

Cluster usage tags cannot be dynamically modified at runtime from within a notebook. The `spark.databricks.clusterUsageTags.*` configurations are read-only properties set when the cluster is created or configured, and `spark.conf.set()` cannot modify them during execution.

When you use `spark.conf.get("spark.databricks.clusterUsageTags.clusterAllTags")`, you can read the current tags, but attempting to set them with `spark.conf.set()` has no effect because these are cluster-level configurations that are immutable once the cluster is running.
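For example, a notebook can read (but not change) the tags applied at cluster creation. A minimal sketch, assuming the documented behavior that `clusterAllTags` holds a JSON array of `{"key": ..., "value": ...}` entries:

```python
import json

def parse_cluster_tags(all_tags_json: str) -> dict:
    """Convert the JSON array stored in clusterAllTags into a {key: value} dict."""
    return {tag["key"]: tag["value"] for tag in json.loads(all_tags_json)}

# In a notebook (read-only; spark.conf.set() on these keys has no effect):
# raw = spark.conf.get("spark.databricks.clusterUsageTags.clusterAllTags")
# print(parse_cluster_tags(raw))

# Sample value in the same shape, for illustration:
sample = '[{"key": "project", "value": "alpha"}, {"key": "department", "value": "finance"}]'
print(parse_cluster_tags(sample))  # {'project': 'alpha', 'department': 'finance'}
```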

Alternative Solutions for Cost Tracking

Use Job Clusters with Different Tags


Instead of a single all-purpose cluster, create job-specific clusters where each job can have custom tags for `project` and `department`. Because the tags are part of each job's cluster specification, this gives you granular cost attribution per job.
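A sketch of what that could look like when building a job definition. The `custom_tags` field follows the Jobs API's `new_cluster` shape; the runtime version, node type, and tag values below are example assumptions, not required names:

```python
def job_cluster_spec(project: str, department: str) -> dict:
    """Build a job cluster spec whose custom_tags drive cost attribution.

    Every field value here is an example; substitute your own runtime
    version, node type, and sizing.
    """
    return {
        "new_cluster": {
            "spark_version": "15.4.x-scala2.12",  # example runtime version
            "node_type_id": "i3.xlarge",          # example node type
            "num_workers": 2,
            "custom_tags": {
                "project": project,
                "department": department,
            },
        }
    }

spec = job_cluster_spec("alpha", "finance")
print(spec["new_cluster"]["custom_tags"])  # {'project': 'alpha', 'department': 'finance'}
```

Each job run then carries its own tags, so costs roll up per project/department without touching a shared all-purpose cluster.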

Serverless Budget Policies


If you're using serverless compute (Public Preview), you can use serverless budget policies to automatically tag usage at the user or group level. When users are assigned different policies, their usage is automatically tagged with the policy's custom tags.

API-Based Cluster Management


Programmatically update cluster tags using the Databricks Clusters API before running workloads:
- Call the API to update cluster configuration with new tags
- Restart the cluster (if needed)
- Run your notebook with the updated tags
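The steps above can be sketched as follows. The `edit` endpoint is a real Clusters API call, but the host, token, and base config here are placeholders; note that `edit` replaces the whole cluster configuration, so the existing settings must be sent along with the new tags, and the change only applies after a restart:

```python
import json
import urllib.request

def build_edit_payload(cluster_id: str, base_config: dict,
                       project: str, department: str) -> dict:
    """Merge new cost-attribution tags into an existing cluster config.

    The Clusters API edit call replaces the full configuration, so the
    current settings (spark_version, node_type_id, ...) are carried over.
    """
    payload = dict(base_config)
    payload["cluster_id"] = cluster_id
    tags = dict(payload.get("custom_tags", {}))
    tags.update({"project": project, "department": department})
    payload["custom_tags"] = tags
    return payload

def edit_cluster(host: str, token: str, payload: dict) -> None:
    """POST the updated config to /api/2.0/clusters/edit (host/token assumed)."""
    req = urllib.request.Request(
        f"{host}/api/2.0/clusters/edit",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)
```

Keep in mind this restarts the cluster between workloads, so it trades dynamic tagging for downtime; job clusters are usually the cleaner option.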

System Tables for Cost Analysis


Use the `system.billing.usage` table to track costs. While you can't change cluster tags dynamically, you can add metadata tracking within your notebooks (logging project/department info to a table) and join this with billing data for cost attribution.
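One way to combine the two, as a sketch: `system.billing.usage` and its `usage_metadata` struct are real system-table names, but the notebook-side log table `run_metadata` and its columns are hypothetical and would be whatever your notebooks write:

```python
def cost_by_project_sql(log_table: str = "main.ops.run_metadata") -> str:
    """Return a query joining notebook-logged metadata with billing usage.

    Assumes log_table (a made-up example) has columns
    (job_run_id, project, department) written by each notebook run.
    """
    return f"""
        SELECT m.project, m.department, SUM(u.usage_quantity) AS dbus
        FROM system.billing.usage AS u
        JOIN {log_table} AS m
          ON u.usage_metadata.job_run_id = m.job_run_id
        GROUP BY m.project, m.department
    """

# In a notebook: display(spark.sql(cost_by_project_sql()))
```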

The fundamental limitation is that cluster tags are designed to be set at the infrastructure level, not modified during runtime execution.

Hope this helps, Louis.
