Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Passing UNIX-based parameter to a task

Garrus990
New Contributor

Hey,

I would like to pass a task a parameter that is based on a UNIX function. Concretely, I would like to specify dates that are calculated dynamically, relative to the date my job runs. I wanted to do it like this:

["--period-start", "$(date -d '-7 days' +'%Y-%m-%d')", "--period-end", "$(date -d '-1 days' +'%Y-%m-%d')"]

It doesn't seem to work, though. Is there any way of achieving it?

1 REPLY

NandiniN
Databricks Employee

Hi @Garrus990 ,

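Task parameters are passed to the task as literal strings. They are not run through a shell, so the $(date ...) substitution is never evaluated. To get dates that are calculated relative to the date your job runs, compute them in an upstream task and pass the results on to the task that needs them.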

Use a Notebook to Calculate Dates: Create a notebook that calculates the required dates using Python or Scala. For example, in Python:

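%py
from datetime import datetime, timedelta

# Take the run time once so both dates are based on the same instant.
now = datetime.now()
period_start = (now - timedelta(days=7)).strftime('%Y-%m-%d')
period_end = (now - timedelta(days=1)).strftime('%Y-%m-%d')

# Publish the dates as task values so later tasks in the same job can reference them.
dbutils.jobs.taskValues.set(key="period_start", value=period_start)
dbutils.jobs.taskValues.set(key="period_end", value=period_end)

# Also return both dates as a JSON string when the notebook finishes.
dbutils.notebook.exit(f'{{"period_start": "{period_start}", "period_end": "{period_end}"}}')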
  1. Run the Notebook as a Task: Add this notebook as the first task of your job. It calculates the dates and returns them as a JSON string.

  2. Use the Output in Subsequent Tasks: In the tasks that depend on it, use the output of the first task as input parameters. Values are passed between tasks of the same job, so the task that consumes the dates must belong to the same job and depend on the task that calculates them.
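The date calculation from the notebook can also be factored into a small function that takes the run date as an argument, which makes it easy to check outside Databricks. This is only a sketch: compute_period is a hypothetical helper name, not part of any Databricks API, and the 7-day and 1-day offsets are taken from the question.

```python
import json
from datetime import date, timedelta
from typing import Optional


def compute_period(run_date: Optional[date] = None) -> dict:
    """Return period_start (run date - 7 days) and period_end (run date - 1 day).

    compute_period is a hypothetical helper for illustration only.
    """
    if run_date is None:
        run_date = date.today()
    start = run_date - timedelta(days=7)
    end = run_date - timedelta(days=1)
    # isoformat() on a date yields YYYY-MM-DD, the same as strftime('%Y-%m-%d').
    return {"period_start": start.isoformat(), "period_end": end.isoformat()}


# Example with a fixed run date of 2024-11-01.
print(json.dumps(compute_period(date(2024, 11, 1))))
# prints {"period_start": "2024-10-25", "period_end": "2024-10-31"}
```

Passing the run date in explicitly keeps the result reproducible when a run is repeated for an earlier day.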

Here is an example of how you can set up the job configuration:

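  • Task 1: Calculate Dates

    • Notebook: calculate_dates
    • Output: {"period_start": "2024-10-25", "period_end": "2024-10-31"}
  • Task 2: Main Job (depends on Task 1)

    • Notebook: main_job
    • Parameters: --period-start {{tasks.calculate_dates.values.period_start}} --period-end {{tasks.calculate_dates.values.period_end}}

The {{tasks.<task_name>.values.<key>}} references assume that the first task is named calculate_dates and that it sets period_start and period_end as task values with dbutils.jobs.taskValues.set.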
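If the task that consumes the dates is a Python script you control, another option is to make both parameters optional and compute the defaults inside the script, so nothing dynamic has to be passed at all. A minimal sketch, assuming the script reads --period-start and --period-end as in the question; parse_args and its today argument are hypothetical names introduced here for illustration.

```python
import argparse
from datetime import date, timedelta
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None,
               today: Optional[date] = None) -> argparse.Namespace:
    """Parse the period arguments, defaulting to the last 7 days up to yesterday.

    parse_args and the today parameter are hypothetical, for illustration only.
    """
    today = today or date.today()
    parser = argparse.ArgumentParser()
    # Defaults are evaluated at run time, relative to the day the script runs.
    parser.add_argument("--period-start",
                        default=(today - timedelta(days=7)).isoformat())
    parser.add_argument("--period-end",
                        default=(today - timedelta(days=1)).isoformat())
    return parser.parse_args(argv)


# Example: only --period-end is given, so --period-start falls back to its default.
args = parse_args(["--period-end", "2024-10-31"], today=date(2024, 11, 1))
print(args.period_start, args.period_end)
# prints 2024-10-25 2024-10-31
```

Explicit values still win over the defaults, so a specific period can be backfilled by passing both parameters as fixed strings.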
Thanks!
 
