3 weeks ago
Good Day all,
After having issues with the cloud resources allocated to Lakeflow jobs and gateways, I am trying to apply a policy to the cluster that is allocated to the job. I am very new to much of the Databricks platform and its administration, so all help is appreciated.
I have run the following command:
```bash
databricks clusters edit clusterid 16.4.x-scala2.13 --apply-policy-default-values --policy-id policyid --num-workers 1 -p adam
```

I am now getting the following error:

```
Error: NO_ISOLATION or custom access modes are not allowed in this workspace. Please contact your workspace administrator to use this feature.
```

I have looked through the account and workspace settings and can't see where to change this. A search suggested the access mode is editable on the cluster, but I can't edit the cluster that belongs to the created pipeline.

Is there a problem with my CLI command, or where do I need to make the change so I can apply a compute policy to the DLT compute?
3 weeks ago
Hey @Adam_Borlase, thanks for sharing the command and error. This is a common pitfall when trying to control Lakeflow (DLT) compute with cluster policies: pipeline clusters aren't edited through `databricks clusters edit`. Instead, attach the policy in the pipeline's own settings JSON:

```json
{
  "clusters": [
    {
      "label": "default",
      "policy_id": "<policy-id>",
      "apply_policy_default_values": true
    }
  ]
}
```

In the policy itself, include `"cluster_type": { "type": "fixed", "value": "dlt" }` so the policy is selectable for pipelines, and forbid overriding the cluster profile:

```json
"spark_conf.spark.databricks.cluster.profile": {
  "type": "forbidden",
  "hidden": true
}
```
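Putting those pieces together, here is a sketch of a policy body that also pins the gateway to instance types inside your quota. The node types below are placeholders for illustration; substitute a family your subscription actually has quota for.

```json
{
  "cluster_type": {
    "type": "fixed",
    "value": "dlt"
  },
  "node_type_id": {
    "type": "allowlist",
    "values": ["Standard_D4ds_v5", "Standard_D8ds_v5"],
    "defaultValue": "Standard_D4ds_v5"
  },
  "spark_conf.spark.databricks.cluster.profile": {
    "type": "forbidden",
    "hidden": true
  }
}
```

With `--apply-policy-default-values` (or `"apply_policy_default_values": true`), the `defaultValue` entries are filled in automatically when the pipeline creates its cluster.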
3 weeks ago
Good Afternoon Louis,
Thank you for the detailed answer. The issue I face is that the default gateway is allocating virtual CPUs that are not in our quotas, so I need to apply the compute policy at the creation stage. At this point I can see the pipeline's settings YAML but have no option to edit the pipeline (see the attached image) before it completes the setup of a new Lakeflow Connect on the SQL Server.
I have also tried to update the pipeline definition as mentioned above and am getting other failures. What would be the best way to set up a new data ingestion pipeline that applies our compute policy, so we only use the CPUs that are allocated to us?
Do we need to contact our infrastructure team to increase our quotas, or is there a way to control, at the creation stage of a new SQL Server data ingestion, the type of compute it uses? This is the first one we are setting up, so we are very inexperienced with the issues we are facing.
3 weeks ago
@Adam_Borlase , Thanks, this is helpful context. The key is that the SQL Server connector’s ingestion pipeline runs on serverless, while the ingestion “gateway” runs on classic compute in your cloud account, so vCPU family quotas can block gateway creation unless you control instance types at creation time with a compute policy and/or the API.
First, create the gateway with the policy attached:

```bash
databricks pipelines create --json '{
  "name": "sqlserver-gateway",
  "gateway_definition": {
    "connection_id": "<CONNECTION_ID>",
    "gateway_storage_catalog": "main",
    "gateway_storage_schema": "sqlserver01",
    "gateway_storage_name": "sqlserver01-gateway"
  },
  "clusters": [{
    "label": "default",
    "policy_id": "<POLICY_ID>",
    "apply_policy_default_values": true
  }]
}'
```
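If you script both steps, the gateway's pipeline id (needed as `ingestion_gateway_id` in the next call) can be pulled from the create response. A sketch, assuming the CLI prints the created pipeline as JSON with a top-level `pipeline_id` field (check this against your CLI version) and that `python3` is available for parsing:

```shell
# Extract the new gateway's pipeline id from the JSON printed by
# `databricks pipelines create`, so it can be fed to the ingestion
# pipeline's create call. The sample RESPONSE below stands in for
# the real CLI output, which would come from the command above.
RESPONSE='{"pipeline_id": "0123-example-id"}'
GATEWAY_PIPELINE_ID=$(printf '%s' "$RESPONSE" \
  | python3 -c 'import json, sys; print(json.load(sys.stdin)["pipeline_id"])')
echo "$GATEWAY_PIPELINE_ID"
```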
Then create the ingestion pipeline that points at the gateway:

```bash
databricks pipelines create --json '{
  "name": "sqlserver-ingestion-pipeline",
  "ingestion_definition": {
    "ingestion_gateway_id": "<GATEWAY_PIPELINE_ID>",
    "objects": [
      { "schema": {
          "source_catalog": "sqlserver01",
          "source_schema": "dbo",
          "destination_catalog": "main",
          "destination_schema": "sqlserver01"
      }}
    ]
  }
}'
```
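If you would rather ingest specific tables than a whole schema, the `objects` list also accepts table-level entries. A sketch modeled on the schema example above; the exact field names should be verified against the Pipelines API reference:

```json
"objects": [
  { "table": {
      "source_catalog": "sqlserver01",
      "source_schema": "dbo",
      "source_table": "customers",
      "destination_catalog": "main",
      "destination_schema": "sqlserver01"
  }}
]
```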
3 weeks ago
Thank you so much Louis,
This has resolved all of our issues! Really appreciate the help.