Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Policy for DLT

ankit001mittal
New Contributor III

Hi,
I am trying to define a cluster policy for our DLT pipelines, and I would like to pin a specific Spark version, as in the example below:

 

{
  "spark_conf.spark.databricks.cluster.profile": {
    "type": "forbidden",
    "hidden": true
  },
  "spark_version": {
    "type": "allowlist",
    "values": [
      "14.3.x-scala2.12"
    ]
  },
  "node_type_id": {
    "type": "unlimited",
    "defaultValue": "Standard_DS3_v2",
    "isOptional": true
  },
  "num_workers": {
    "type": "unlimited",
    "defaultValue": 4,
    "isOptional": true
  },
  "azure_attributes.availability": {
    "type": "unlimited",
    "defaultValue": "SPOT_WITH_FALLBACK_AZURE"
  },
  "azure_attributes.spot_bid_max_price": {
    "type": "fixed",
    "value": 100,
    "hidden": true
  },
  "instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "driver_instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "cluster_type": {
    "type": "fixed",
    "value": "dlt"
  }
}

 

 But I am getting this error in my pipeline:

INVALID_PARAMETER_VALUE: [DLT ERROR CODE: INVALID_CLUSTER_SETTING.CLIENT_ERROR] The cluster policy specified in the pipeline settings is not compatible with Delta Live Tables. Remove 'spark_version' from your cluster policy.

Could you please help me with it?


lingareddy_Alva
Honored Contributor II

Hi @ankit001mittal 

You're seeing this error because Delta Live Tables (DLT) automatically manages certain cluster configurations, including the Spark version. DLT pipelines run on Spark versions matched to the DLT runtime, so letting users pin a custom Spark version in the cluster policy can cause compatibility issues.
Here's how to fix your cluster policy for DLT pipelines:

Remove the spark_version constraint from your policy:

{
  "spark_conf.spark.databricks.cluster.profile": {
    "type": "forbidden",
    "hidden": true
  },
  "node_type_id": {
    "type": "unlimited",
    "defaultValue": "Standard_DS3_v2",
    "isOptional": true
  },
  "num_workers": {
    "type": "unlimited",
    "defaultValue": 4,
    "isOptional": true
  },
  "azure_attributes.availability": {
    "type": "unlimited",
    "defaultValue": "SPOT_WITH_FALLBACK_AZURE"
  },
  "azure_attributes.spot_bid_max_price": {
    "type": "fixed",
    "value": 100,
    "hidden": true
  },
  "instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "driver_instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "cluster_type": {
    "type": "fixed",
    "value": "dlt"
  }
}
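
Once the policy is updated, you attach it to the pipeline through the pipeline's own settings JSON rather than by pinning a Spark version. Here is a minimal sketch of what that could look like (the pipeline name, policy ID, and notebook path are placeholders for illustration, not values from your workspace):

{
  "name": "my-dlt-pipeline",
  "edition": "ADVANCED",
  "clusters": [
    {
      "label": "default",
      "policy_id": "<your-dlt-policy-id>",
      "num_workers": 4
    }
  ],
  "libraries": [
    {
      "notebook": {
        "path": "/Repos/project/dlt_pipeline_notebook"
      }
    }
  ],
  "continuous": false
}

Note that there is no spark_version anywhere in the pipeline settings either; DLT resolves the runtime itself.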


Why this happens:
1. DLT Runtime Management: DLT automatically selects and manages the appropriate Spark version based on the DLT runtime version and channel (current/preview) you're using
2. Compatibility: DLT includes specific optimizations and features that require particular Spark versions
3. Automatic Updates: DLT handles Spark version updates as part of its managed service approach


Alternative approaches if you need version control:
1. Use DLT Runtime Channels: Instead of specifying Spark versions, you can control which DLT runtime channel your pipeline uses (current vs preview) in the pipeline configuration (see the sketch after this list)
2. Separate Policies: Consider having separate cluster policies - one for DLT pipelines (without spark_version) and another for regular clusters (with spark_version constraints)
3. Pipeline-Level Configuration: Set any specific runtime requirements at the pipeline level rather than the cluster policy level
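
For point 1, a rough sketch of how the channel is chosen at the pipeline level (the pipeline name is again a placeholder): the channel field in the pipeline settings JSON accepts CURRENT (the default) or PREVIEW, and DLT then picks the matching Spark version for you.

{
  "name": "my-dlt-pipeline",
  "channel": "PREVIEW"
}

This fragment shows only the relevant field; the rest of your pipeline settings stay as they are.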

 

LR
