Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

At least 1 "file_arrival" blocks are required.

lezwon
New Contributor II

Hi folks, I'm trying to set up a Databricks asset bundle for a job that loads some product data into Databricks. The job was created in the Databricks workspace and loads the data from a location hardcoded into the notebook (for now). It is supposed to run every 3 hours. I generated the local YAML files using the `databricks bundle generate job --existing-job-id <job-id>` command.

Then I tried deploying it back to the workspace with `databricks bundle deploy -t dev`, expecting a job like `[dev <your-username>] <project-name>_job` to show up. Instead I got the following error:

```
Deploying resources...
Error: terraform apply: exit status 1

Error: Insufficient file_arrival blocks

  on bundle.tf.json line 69, in resource.databricks_job.product_autoloader.trigger:
  69: },

At least 1 "file_arrival" blocks are required.
```

I'm not sure why this happens. It has something to do with the `file_arrival` key under `trigger` in the config, but I don't need that particular setting: the path is hardcoded in the notebook, and the job runs on a schedule. This is my job YAML file:

```yaml
resources:
  jobs:
    product_autoloader:
      name: product autoloader
      job_clusters:
        - job_cluster_key: product_autoloader_cluster
          new_cluster:
            spark_version: 15.4.x-scala2.12
            enable_elastic_disk: true
            runtime_engine: STANDARD
            node_type_id: Standard_D4ads_v5
            azure_attributes:
              spot_bid_max_price: 100
              availability: SPOT_WITH_FALLBACK_AZURE
              first_on_demand: 1
            num_workers: 4
            data_security_mode: USER_ISOLATION
            policy_id: 000E1A8BA1F8767B
      tasks:
        - task_key: product_autoloader
          job_cluster_key: product_autoloader_cluster
          email_notifications: {}
          max_retries: 2
          run_if: ALL_SUCCESS
          webhook_notifications: {}
          notebook_task:
            base_parameters:
              folder: swproduct-product-updates
            notebook_path: ../src/product autoloader.py
            source: WORKSPACE
      trigger:
        pause_status: UNPAUSED
      queue:
        enabled: true
      tags:
        "autoloader": ""
      email_notifications: {}
      max_concurrent_runs: 1
      webhook_notifications: {}
```

Can someone guide me on this? Thanks!

 

1 ACCEPTED SOLUTION

Accepted Solutions

Tharani
New Contributor III

Since it is a scheduled job, I think you have to explicitly specify a cron-based schedule instead of using `file_arrival` in the `trigger` section of the YAML file.
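
As a sketch of what that could look like (the cron expression and timezone here are assumptions; adjust for your own cadence and region), an every-3-hours schedule on the job would replace the empty `trigger` block:

```yaml
resources:
  jobs:
    product_autoloader:
      # Instead of `trigger:`, use a cron-based schedule.
      schedule:
        # Quartz cron: fire at minute 0 of every 3rd hour (assumed cadence)
        quartz_cron_expression: "0 0 0/3 * * ?"
        timezone_id: UTC
        pause_status: UNPAUSED
```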


lezwon
New Contributor II

Yep, fixed it with:

```yaml
trigger:
  pause_status: UNPAUSED
  periodic:
    interval: 3
    unit: HOURS
```

Not sure why the incorrect config was generated in the first place.
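
For contrast (a hedged sketch, not from this thread; the storage path is hypothetical): if a file-arrival trigger *were* the intent, Terraform's complaint means the `trigger` block must contain a `file_arrival` entry with a storage location, roughly:

```yaml
trigger:
  pause_status: UNPAUSED
  file_arrival:
    # Hypothetical location; a file_arrival trigger requires a url to watch
    url: abfss://container@storageaccount.dfs.core.windows.net/path/to/files/
```

In other words, a `trigger` block that is present but empty satisfies neither the periodic nor the file-arrival shape, which is why the generated config failed to apply.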