yesterday - last edited yesterday
Dear Community,
I’m seeing a compute error when running a Databricks ingestion pipeline (Lakeflow managed ingestion) on AWS.
Cloud: AWS
Region: ap-northeast-2
Source: SQL Server ingestion pipeline
When I start the ingestion pipeline, it fails with a compute error.
I found that the pipeline's job cluster is using an instance type that looks inappropriate for this region, and I currently suspect this is the cause of the issue.
Could you help check why the ingestion pipeline cannot start in this workspace/region, and what configuration change or workaround you recommend?
Thank you.
yesterday
@kyeongmin_baek - I suspect it is either because the instance type is not available in ap-northeast-2, or because of temporary capacity exhaustion. This is common with On-Demand instances in less common regions or with large instance types.
To fix it, try changing the instance type in your pipeline configuration.
yesterday
Hi, as Raman says, it is probably that the instance type is not currently available. To see which ones are available so you can update the config, you can run the following:
from databricks.sdk import WorkspaceClient
import json

def list_node_types():
    # List every node type the workspace can provision in its region.
    w = WorkspaceClient()
    response = w.clusters.list_node_types()
    return response.node_types

def filter_node_types(min_cores, max_cores, min_memory, max_memory):
    # Keep only node types within the requested core and memory (MB) ranges.
    node_types = list_node_types()
    filtered_node_types = [
        node_type for node_type in node_types
        if min_cores <= node_type.num_cores <= max_cores
        and min_memory <= node_type.memory_mb <= max_memory
    ]
    return filtered_node_types

def node_type_to_dict(node_type):
    # Project the fields we care about into a plain dict for display.
    return {
        'node_type_id': node_type.node_type_id,
        'num_cores': node_type.num_cores,
        'memory_mb': node_type.memory_mb,
        'description': node_type.description,
    }

# Example: 4-core nodes with 8-16 GB of memory.
filtered_node_types = filter_node_types(4, 4, 8192, 16384)
filtered_node_types_dicts = [node_type_to_dict(node_type) for node_type in filtered_node_types]
display(json.dumps(filtered_node_types_dicts))
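Pick a node_type_id from the output that is actually offered in ap-northeast-2 and use it in the pipeline spec below.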
The default for the SQL Server connector gateway on AWS is r5.xlarge.
You can set the driver node type in the pipeline config -
gateway_pipeline_spec = {
    "pipeline_type": "INGESTION_GATEWAY",
    "name": gateway_pipeline_name,
    "gateway_definition": gateway_def.as_dict(),
    # The Pipelines API expects "clusters" to be a list of cluster specs.
    "clusters": [
        {
            "label": "default",
            "driver_node_type_id": "r5.xlarge",
        }
    ],
}
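For completeness, here is a minimal sketch of creating the gateway pipeline from that spec via the REST API; gateway_pipeline_name and gateway_def are assumed to already be defined in your setup, and the spec dict maps onto the POST /api/2.0/pipelines request body:

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# POST the spec above as the create-pipeline request body.
created = w.api_client.do("POST", "/api/2.0/pipelines", body=gateway_pipeline_spec)
print(created["pipeline_id"])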
I hope this helps
yesterday
If it is capacity/quota related, please go to AWS Service Quotas and request an increase for the relevant EC2 instance quota: https://console.aws.amazon.com/servicequotas/home
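If you prefer to do it programmatically, here is a sketch with boto3; the quota code below (L-1216C47A, "Running On-Demand Standard instances") is an assumption - check the console for the quota you are actually hitting:

import boto3

# Service Quotas client in the affected region.
client = boto3.client("service-quotas", region_name="ap-northeast-2")

# L-1216C47A = "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances".
# Assumed to be the relevant quota; confirm in the Service Quotas console.
client.request_service_quota_increase(
    ServiceCode="ec2",
    QuotaCode="L-1216C47A",
    DesiredValue=256.0,  # example target vCPU count
)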
yesterday
Thank you for your response.
I have an additional question.
When creating a SQL Server ingestion pipeline using the Databricks Connector, is it possible to edit the compute instance type settings?
I am currently configuring this in the Databricks UI, but it seems that this option is not shown.
21 hours ago
Hi, I'm afraid you cannot edit the compute instance type settings for SQL Server ingestion pipelines via the Databricks UI. Such changes can only be made via the API.
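As a rough sketch of what that API change could look like (the pipeline ID is a placeholder, and since the update endpoint replaces the whole spec, we fetch the existing one first and change only the clusters):

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
pipeline_id = "<your-gateway-pipeline-id>"  # hypothetical placeholder

# Fetch the existing spec, swap only the cluster node type,
# and PUT the whole spec back (the update API replaces it wholesale).
spec = w.api_client.do("GET", f"/api/2.0/pipelines/{pipeline_id}")["spec"]
spec["clusters"] = [{"label": "default", "driver_node_type_id": "m5.xlarge"}]
w.api_client.do("PUT", f"/api/2.0/pipelines/{pipeline_id}", body=spec)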
21 hours ago
Not through the UI, but it can be done with Databricks Asset Bundles (DABs).
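A minimal sketch of what that could look like in a bundle's databricks.yml (the resource key and pipeline name are hypothetical; the cluster block maps onto the same Pipelines API fields as above):

resources:
  pipelines:
    sqlserver_gateway:  # hypothetical resource key
      name: sqlserver-ingestion-gateway
      clusters:
        - label: default
          driver_node_type_id: r5.xlarge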