04-09-2025 07:26 AM
You're correct in observing this discrepancy. When a cluster policy is defined and applied through the Databricks UI, fixed environment variables (`spark_env_vars`) specified in the policy automatically propagate to clusters created under that policy. However, when using Terraform, this behavior does not occur automatically due to how the Terraform provider currently handles cluster policies and nested attributes.
Why This Happens
The difference arises from the implementation of the Databricks Terraform provider:
- UI Behavior: The Databricks UI directly enforces policy constraints and propagates fixed values (like `spark_env_vars`) to clusters created under the policy.
- Terraform Behavior: The Terraform provider requires explicit specification of nested attributes like `spark_env_vars` in the cluster definition, even if they are fixed in the policy. The `apply_policy_default_values` parameter only applies default values for top-level attributes, not nested ones.
This is a limitation of the Terraform provider's design, and it has been noted by users in various forums and GitHub issues. The provider does not yet fully replicate the behavior of the Databricks UI when it comes to applying policies, especially for nested configurations like environment variables.
Potential Workarounds
If you want to mimic the UI behavior in Terraform, here are some approaches:
1. Explicitly Define Environment Variables in Cluster Configuration
Set `spark_env_vars` explicitly in your cluster definition, mirroring the values fixed in your policy. This takes extra effort, and the two definitions have to be kept in sync, but it ensures consistency with your policy.
2. Use a Script or Module to Automate Variable Propagation
Create a Terraform module or script that reads the policy definition (e.g., via the Databricks API) and applies its fixed values to clusters. This requires custom scripting, but it removes the need to copy the values by hand.
3. Raise an Issue with the Terraform Provider
If this behavior is critical for your workflow, consider raising an issue on the [Databricks Terraform provider GitHub repository](https://github.com/databricks/terraform-provider-databricks). This could help bring attention to the limitation and potentially lead to improvements in future releases.
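As a rough illustration of the second workaround, here is a minimal Python sketch that pulls the fixed `spark_env_vars` entries out of a policy definition. It assumes the policy definition uses the usual path-style keys (such as `spark_env_vars.MY_VAR`) with `"type": "fixed"`, and that the definition is fetched from the Cluster Policies API endpoint `/api/2.0/policies/clusters/get`. The function names, the sample variable names, and the sample values are hypothetical and only there for illustration; adapt them to your workspace.

```python
import json
import urllib.parse
import urllib.request

# Policy definitions address nested attributes with dotted paths.
ENV_PREFIX = "spark_env_vars."

# Hypothetical sample definition, shaped like a cluster policy definition.
SAMPLE_DEFINITION = json.dumps({
    "spark_env_vars.ENVIRONMENT": {"type": "fixed", "value": "prod"},
    "spark_env_vars.TEAM": {"type": "fixed", "value": "data-eng"},
    "spark_env_vars.OPTIONAL": {"type": "unlimited", "defaultValue": "x"},
    "autotermination_minutes": {"type": "fixed", "value": 30},
})


def extract_fixed_env_vars(definition_json: str) -> dict:
    """Return {VAR_NAME: value} for each fixed spark_env_vars rule."""
    definition = json.loads(definition_json)
    fixed = {}
    for path, rule in definition.items():
        is_env_var = path.startswith(ENV_PREFIX)
        is_fixed = isinstance(rule, dict) and rule.get("type") == "fixed"
        if is_env_var and is_fixed:
            fixed[path[len(ENV_PREFIX):]] = rule["value"]
    return fixed


def fetch_policy_definition(host: str, token: str, policy_id: str) -> str:
    """Fetch a policy's definition (a JSON string) over the REST API.

    Not called in this sketch; it needs a real workspace URL and token.
    """
    query = urllib.parse.urlencode({"policy_id": policy_id})
    url = f"{host}/api/2.0/policies/clusters/get?{query}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["definition"]


# Shape the result so it can be saved as a Terraform variables file.
tfvars = {"spark_env_vars": extract_fixed_env_vars(SAMPLE_DEFINITION)}
print(json.dumps(tfvars, indent=2))
```

One way to use this would be to write the printed JSON to a `.auto.tfvars.json` file and declare a matching `spark_env_vars` variable (a map of strings) that your cluster resource references, so the values stay in sync with the policy instead of being copied manually.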
Summary
The discrepancy between the UI and Terraform arises because the Terraform provider does not automatically propagate nested attributes like `spark_env_vars` from policies to clusters. While this is frustrating, explicitly defining these variables in your cluster configuration is currently required when using Terraform. Until the provider's behavior changes, automation or scripting can help bridge the gap.