Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to access the job scheduling date from within the notebook?

karolinalbinsso
New Contributor II

I have created a job that contains a notebook that reads a file from Azure Storage.

The file name contains the date on which the file was transferred to storage. A new file arrives every Monday, and the read job is scheduled to run every Monday.

In my notebook, I want to use the job's schedule date to read the file from Azure Storage whose filename contains that same date, something like this:

file_location = file_name + "_" + job_date + "_" + country_id + ".csv"

I have tried passing a date as a parameter, and I can access it from the notebook, but if the job fails and I want to re-run it the next day, I have to manually enter yesterday's date as the input parameter. I want to avoid this and just use the job's actual scheduling date.

How do I access the job scheduling date from within the notebook?

Thanks in advance

Karolin
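One possible workaround (a sketch, not a Databricks-specific API): keep the date as a job parameter, but make the notebook fall back to the most recent Monday when the parameter is empty, so a next-day re-run still resolves to the intended date. The `most_recent_monday` helper and the filename pattern below are hypothetical, based only on the description in the question.

```python
from datetime import date, timedelta

def most_recent_monday(today: date) -> date:
    """Return today if it is a Monday, otherwise the preceding Monday."""
    return today - timedelta(days=today.weekday())

def build_file_location(file_name: str, country_id: str, run_date: date) -> str:
    """Assemble the blob name the way the question describes (hypothetical pattern)."""
    return f"{file_name}_{run_date.isoformat()}_{country_id}.csv"

# In a Databricks notebook, the date could come from a widget and fall back
# to the computed Monday when the parameter is left empty, e.g.:
#   raw = dbutils.widgets.get("job_date")
#   run_date = date.fromisoformat(raw) if raw else most_recent_monday(date.today())
run_date = most_recent_monday(date(2024, 1, 10))  # 2024-01-10 is a Wednesday
print(build_file_location("sales", "SE", run_date))
```

This way a re-run on Tuesday still computes Monday's date instead of requiring a manual parameter.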

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

Hi, I guess the files are in the same directory structure, so you can use the cloud files Auto Loader. It will incrementally read only new files: https://docs.microsoft.com/en-us/azure/databricks/spark/latest/structured-streaming/auto-loader

So it works the other way around: instead of passing the date in, you can take the date from the input file path using:

.withColumn("filePath", input_file_name())
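Extracting the embedded date from that path could then look like the following pure-Python sketch; the `sales_2024-01-08_SE.csv` naming pattern is an assumption based on the question, not something the thread confirms.

```python
import re
from datetime import date

def date_from_path(file_path: str) -> date:
    """Pull an ISO date (YYYY-MM-DD) out of a file path; hypothetical pattern."""
    match = re.search(r"(\d{4}-\d{2}-\d{2})", file_path)
    if match is None:
        raise ValueError(f"no date found in {file_path!r}")
    return date.fromisoformat(match.group(1))

# On Databricks this logic would be applied to the "filePath" column
# produced by input_file_name(); here we just call it directly:
print(date_from_path("abfss://container@account.dfs.core.windows.net/sales_2024-01-08_SE.csv"))
```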


2 REPLIES


@Kani Yes, I have a similar use case where I run a SQL query filtered by start_date and end_date, and the job has to run every 10 days.

Current run > SELECT * FROM table WHERE start_date > '2024-01-01' AND end_date < '2024-01-10'

Now, if the job is successful, the next run should pick > SELECT * FROM table WHERE start_date > '2024-01-10' AND end_date < '2024-01-20'

The workflow should automatically take these dates on execution.
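A minimal sketch of that rolling window, assuming each new window simply starts where the previous one ended; the 10-day step and the table/column names are placeholders taken from the reply above, not a confirmed schema.

```python
from datetime import date, timedelta

WINDOW_DAYS = 10  # assumed cadence from the reply above

def next_window(prev_end: date) -> tuple:
    """Advance the filter window: new start = previous end, new end = start + 10 days."""
    return prev_end, prev_end + timedelta(days=WINDOW_DAYS)

def build_query(start: date, end: date) -> str:
    """Render the SQL filter; 'table' is a placeholder name."""
    return (f"SELECT * FROM table "
            f"WHERE start_date > '{start.isoformat()}' AND end_date < '{end.isoformat()}'")

start, end = next_window(date(2024, 1, 10))
print(build_query(start, end))
```

One way to persist the window between runs is task values or a small state table, so each run reads the previous end_date and advances it automatically.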

 
