Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Asset Bundles cannot run job with single node job cluster

Volker
New Contributor III

Hello community,

We are deploying a job using asset bundles, and the job should run on a single-node job cluster. Here is the DAB job definition:

resources:
  jobs:
    example_job:
      name: example_job
      tasks:
        - task_key: main_task
          job_cluster_key: ${var.job_cluster_prefix}
          python_wheel_task:
            package_name: example_package
            entry_point: entrypoint
            named_parameters: 
              config-path: "/Workspace${workspace.file_path}/config/app.conf"
              environment: "${bundle.target}"
          libraries:
            - whl: ./dist/*.whl
      job_clusters:
        - job_cluster_key: ${var.job_cluster_prefix}
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: m4.2xlarge
            num_workers: 0
            aws_attributes:
              first_on_demand: 0
              availability: SPOT_WITH_FALLBACK
              zone_id: auto
              spot_bid_price_percent: 100
              ebs_volume_type: GENERAL_PURPOSE_SSD
              ebs_volume_count: 1
              ebs_volume_size: 100
      tags:
        costs/environment: ${bundle.target}
        costs/stage: ${var.costs_stage}
        service: ${var.service}
        domain: ${var.domain}
        owner: ${var.domain}

Since yesterday we have been facing the problem that the cluster spins up but does not run the code. Instead, the following warning is printed to the Log4j output: "WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources"

The problem does not occur when adding 1 worker node or when I edit the job cluster via the UI, so I suspect there is an issue with the asset bundles.

I checked the CLI releases, and the release notes for the newest release "v0.221.1" state: "This release fixes an issue introduced in v0.221.0 where managing jobs with a single-node cluster would fail."

Also strange: locally I have CLI version "v0.218.0" installed, and when I deploy the job from there, the code runs until an intended exception is raised. But instead of failing, the job keeps running, and the same TaskSchedulerImpl warning gets written to the Log4j output.

Did anybody else experience this issue and know how to solve it?

4 REPLIES

kunalmishra9
New Contributor III

Following. I'm having a similar issue in which setting num_workers to 0 doesn't work. When I deploy the bundle, I get:

Error: cannot update job: NumWorkers could be 0 only for SingleNode clusters.

      - job_cluster_key: ${bundle.name}
        new_cluster:
          cluster_name: ""
          spark_version: 15.4.x-scala2.12
          azure_attributes:
            first_on_demand: 1
            availability: ON_DEMAND_AZURE
            spot_bid_max_price: -1
          node_type_id: Standard_E32d_v4 # Worker Node: 32 cores & 256GB RAM
          # driver_node_type_id: Standard_E
          spark_env_vars:
            PYSPARK_PYTHON: /databricks/python3/bin/python3
          enable_elastic_disk: true
          policy_id: ${var.ds_compute_policy_id}
          data_security_mode: SINGLE_USER
          runtime_engine: PHOTON
          num_workers: 0
          # autoscale:
          #   min_workers: 1
          #   max_workers: 2

AlbertWang
Contributor III

@Volker @kunalmishra9 As answered by Ganesh Chandrasekaran, the solution below works; I tested it. By the way, you need to keep both spark_conf and custom_tags.

 

      job_clusters:
        - job_cluster_key: job_cluster
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: Standard_D4ads_v5
            num_workers: 0
            spark_conf:
                "spark.databricks.cluster.profile": "singleNode"
                "spark.master": "local[*, 4]"
            custom_tags:
                "ResourceClass": "SingleNode"

AlbertWang
Contributor III

@Volker @kunalmishra9 I tested it: the element below is not necessary.
num_workers: 0
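
Putting the two replies together, a minimal single-node version of the job cluster from the original post might look like the sketch below. This is only a sketch, not a verified configuration: the Spark version, node type, and aws_attributes are copied from the question, the single-node settings come from the reply above, and num_workers is left out per the note that it is not necessary.

      job_clusters:
        - job_cluster_key: ${var.job_cluster_prefix}
          new_cluster:
            # Unchanged from the original definition in the question
            spark_version: 13.3.x-scala2.12
            node_type_id: m4.2xlarge
            # Single-node settings from the reply above
            spark_conf:
              "spark.databricks.cluster.profile": "singleNode"
              "spark.master": "local[*, 4]"
            custom_tags:
              "ResourceClass": "SingleNode"
            # AWS attributes unchanged from the question
            aws_attributes:
              first_on_demand: 0
              availability: SPOT_WITH_FALLBACK
              zone_id: auto
              spot_bid_price_percent: 100
              ebs_volume_type: GENERAL_PURPOSE_SSD
              ebs_volume_count: 1
              ebs_volume_size: 100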

 

Volker
New Contributor III

Sorry for the late reply. This helped, thank you!
