Hi @Raghav1,
There are a few things happening here, so let me walk through each one and give you a path forward.
UNDERSTANDING THE GATEWAY_DEFINITION ERROR
The error "Modifying following parameter gateway_definition in pipeline settings is not allowed" occurs because certain pipeline parameters are treated as immutable once the pipeline is created. The gateway_definition is an internal field tied to how the pipeline's compute is initially provisioned, and it cannot be changed via a CLI update command after creation.
The recommended approach is to delete the pipeline and recreate it with the correct compute configuration from the start.
HOW TO PROPERLY ATTACH A CLUSTER POLICY TO A PIPELINE
When creating a Lakeflow Spark Declarative Pipeline (SDP) via the Databricks CLI or REST API, you need to include the policy_id in the clusters array of your pipeline definition JSON, along with apply_policy_default_values set to true. This ensures the policy defaults are applied to the pipeline compute.
Here is an example pipeline creation JSON that references a cluster policy:
{
  "name": "my-sdp-pipeline",
  "clusters": [
    {
      "label": "default",
      "policy_id": "<your-cluster-policy-id>",
      "apply_policy_default_values": true
    }
  ],
  "libraries": [
    {
      "notebook": {
        "path": "/path/to/your/notebook"
      }
    }
  ],
  "development": true
}
The "label": "default" ensures the policy applies to both the update cluster and the maintenance cluster that every pipeline creates.
To create this pipeline via the Databricks CLI:
databricks pipelines create --json '{
  "name": "my-sdp-pipeline",
  "clusters": [
    {
      "label": "default",
      "policy_id": "<your-cluster-policy-id>",
      "apply_policy_default_values": true
    }
  ],
  "libraries": [
    {
      "notebook": {
        "path": "/path/to/your/notebook"
      }
    }
  ],
  "development": true
}'
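If the inline JSON gets unwieldy, one option is to keep the spec in a file, validate it locally, and pass the file to the CLI. This is a sketch: the file name is arbitrary, and the `@file` syntax for `--json` assumes a newer version of the Databricks CLI.

```shell
# Write the pipeline spec to a file (same shape as the JSON above).
cat > pipeline-spec.json <<'EOF'
{
  "name": "my-sdp-pipeline",
  "clusters": [
    {
      "label": "default",
      "policy_id": "your-cluster-policy-id",
      "apply_policy_default_values": true
    }
  ],
  "libraries": [
    { "notebook": { "path": "/path/to/your/notebook" } }
  ],
  "development": true
}
EOF

# Catch JSON typos locally before calling the API.
python3 -m json.tool pipeline-spec.json > /dev/null && echo "pipeline-spec.json is valid JSON"

# Then create the pipeline from the file (newer CLI versions accept @file):
#   databricks pipelines create --json @pipeline-spec.json
```

Validating locally first saves a round trip to the API when the failure is just a stray comma in the spec.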
IMPORTANT: POLICIES CONSTRAIN, THEY DO NOT AUTO-DOWNSIZE
A cluster policy constrains what values are allowed for compute configuration. It does not automatically select the smallest possible VM for you. If your pipeline definition does not explicitly set a node_type_id, Databricks will auto-select one, and that auto-selected type must still comply with your policy. If the auto-selected type happens to be Standard_F4s and that exceeds your Azure Free Trial quota, the cluster will fail to launch.
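To make that concrete: a policy rule like the following (an illustrative fragment, not something you need here) only restricts the choices; Databricks can still auto-select the largest type on the list.

```json
{
  "node_type_id": {
    "type": "allowlist",
    "values": ["Standard_DS3_v2", "Standard_F4s"]
  }
}
```

If Standard_F4s is on the list, nothing stops the auto-selector from picking it, which is why a fixed rule is the safer choice on a constrained quota.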
To ensure a small enough VM is used, you should either:
1. Set a fixed node_type_id directly in the pipeline's clusters configuration:
{
  "clusters": [
    {
      "label": "default",
      "policy_id": "<your-policy-id>",
      "apply_policy_default_values": true,
      "node_type_id": "Standard_DS3_v2",
      "driver_node_type_id": "Standard_DS3_v2",
      "num_workers": 0
    }
  ]
}
2. Or set a fixed value in the cluster policy itself so that any pipeline using it always gets a small VM:
{
  "cluster_type": {
    "type": "fixed",
    "value": "dlt"
  },
  "node_type_id": {
    "type": "fixed",
    "value": "Standard_DS3_v2",
    "hidden": true
  },
  "driver_node_type_id": {
    "type": "fixed",
    "value": "Standard_DS3_v2",
    "hidden": true
  },
  "num_workers": {
    "type": "fixed",
    "value": 0,
    "hidden": true
  }
}
Note: For policies intended for Lakeflow Spark Declarative Pipelines (SDP), the policy definition must include a "cluster_type" rule fixed to the value "dlt", as in the example above.
Setting num_workers to 0 gives you a single-node (driver-only) compute resource, which is the most quota-friendly option for an Azure Free Trial.
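If you want to create that policy from the CLI as well, here is a hedged sketch. One subtlety worth knowing: the Cluster Policies API takes the policy rules as a JSON string in its "definition" field, so the inner JSON has to be embedded as an escaped string. The policy name below is a placeholder.

```shell
# The policy rules, same as the example above.
cat > policy-definition.json <<'EOF'
{
  "cluster_type": { "type": "fixed", "value": "dlt" },
  "node_type_id": { "type": "fixed", "value": "Standard_DS3_v2", "hidden": true },
  "driver_node_type_id": { "type": "fixed", "value": "Standard_DS3_v2", "hidden": true },
  "num_workers": { "type": "fixed", "value": 0, "hidden": true }
}
EOF

# Build the create payload, embedding the rules as a JSON *string*.
python3 - <<'EOF'
import json
definition = open("policy-definition.json").read()
payload = {"name": "small-sdp-policy", "definition": definition}
open("policy-payload.json", "w").write(json.dumps(payload))
print("policy-payload.json written")
EOF

# Then, with a configured CLI:
#   databricks cluster-policies create --json @policy-payload.json
```

The returned policy_id is what you reference in the pipeline's clusters block.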
RECOMMENDED STEPS FOR YOUR SITUATION
1. Delete the existing pipeline that is stuck with the wrong configuration.
2. Create a cluster policy (if you have not already) with fixed small VM types and cluster_type set to "dlt". You can verify your available VM sizes in the Azure portal to find one that fits within your quota.
3. Recreate the pipeline using the CLI command above, referencing your policy_id and setting apply_policy_default_values to true.
4. If you want to be explicit and not rely on the policy alone, also specify node_type_id and num_workers: 0 in the clusters block of the pipeline JSON.
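For step 2, you can also check your regional vCPU quota from the Azure CLI before picking a VM size. This is a sketch: the region is a placeholder, and it assumes the az CLI is installed and logged in.

```shell
# Placeholder region; use the region of your Databricks workspace.
REGION="eastus"

if command -v az >/dev/null 2>&1; then
  # Shows current usage vs. limit per VM family,
  # e.g. the "Standard DSv2 Family vCPUs" row for Standard_DS3_v2.
  az vm list-usage --location "$REGION" --output table || true
else
  echo "Azure CLI not found; check Subscriptions > Usage + quotas in the portal instead."
fi
```

A Standard_DS3_v2 driver needs 4 vCPUs, so the relevant family row must show at least 4 unused vCPUs for a single-node pipeline cluster to launch.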
RELEVANT DOCUMENTATION
- Configure classic compute for Lakeflow Spark Declarative Pipelines: https://docs.databricks.com/aws/en/delta-live-tables/configure-compute.html
- Pipeline settings reference: https://docs.databricks.com/aws/en/delta-live-tables/settings.html
- Compute policy definition reference: https://docs.databricks.com/aws/en/admin/clusters/policy-definition.html
- Pipelines CLI reference: https://docs.databricks.com/aws/en/dev-tools/cli/databricks-cli.html
Since you are on Azure, replace "aws" in the documentation URLs above with "azure" to see the Azure-specific versions.
* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review each draft for obvious issues and to monitor system reliability, and I update it when I detect drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand-new features.