Re: Can't pass dynamic parameters to non-notebook ...

zed · ‎11-05-2024

I need to access the run date of a job that runs as a non-notebook Python job (spark_python_task). I want to pass a value from the CLI when running it and be able to read that value in the script.

I tried the approaches in the attached image when running:

bundle run my_job --params run_date=20240101

Walter_C · ‎11-05-2024

Job parameters are automatically pushed down as key-value parameters to all tasks that accept key-value parameters, which include the following task types:

Notebook
Python wheel (only when configured with keyword arguments)
SQL query, legacy dashboard, or file
Run Job

zed · ‎11-05-2024

Hi, thank you for your response. I have a few follow-up questions to clarify best practices when it comes to passing parameters with Python files:

If I want to pass parameters, should I avoid using spark_python_task for Python scripts?
In the context of using Databricks Asset Bundles, is it generally discouraged to submit Databricks jobs using Python files (vs. notebooks)?
I was able to pass parameters with --python-params, for example --run_date 20240101, and then load them with an argument parser. Is it accurate to say that spark_python_task does not support key-value parameter passing? If so, what would you recommend if I want to keep my project in Python files rather than notebooks while still being able to pass parameters?
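For reference, the argument-parser approach mentioned above could look roughly like the sketch below. It assumes the script receives `--run_date 20240101` as positional command-line arguments; the function names `parse_args` and `yyyymmdd` are illustrative and not part of any Databricks API.

```python
import argparse
from datetime import date, datetime


def yyyymmdd(value: str) -> date:
    """Convert a string such as '20240101' into a date (raises ValueError if malformed)."""
    return datetime.strptime(value, "%Y%m%d").date()


def parse_args(argv=None) -> argparse.Namespace:
    """Parse --run_date from argv; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(description="Example entry point for a Python script task")
    parser.add_argument("--run_date", required=True, type=yyyymmdd,
                        help="Run date in YYYYMMDD format, e.g. 20240101")
    # parse_known_args tolerates extra arguments the script does not declare.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulated arguments for illustration; in the real script call parse_args() with no argument.
args = parse_args(["--run_date", "20240101"])
print(args.run_date.isoformat())
```

Passing an explicit list to `parse_args` also makes the parsing logic easy to unit test without launching a job.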

Thank you for your help with this!

zed · ‎11-05-2024

I have one more question. To clarify, my previous questions were about jobs that are triggered manually, without scheduling.

For scheduled jobs—particularly those using Python script tasks—how can I configure the `job.yml` resource and the Python script to dynamically retrieve `{{job.start_time.iso_date}}` at runtime?
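One possible shape for the script side, as a sketch only: it assumes the task's `parameters` list in `job.yml` contains something like `["--run_date", "{{job.start_time.iso_date}}"]`, and that the reference is resolved to a `YYYY-MM-DD` string before the script starts. Both points are assumptions to verify against the dynamic value reference documentation; `parse_run_date` is a hypothetical helper name.

```python
import argparse
from datetime import date


def parse_run_date(argv=None) -> date:
    """Read --run_date as an ISO date (YYYY-MM-DD); argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser()
    # date.fromisoformat raises ValueError on malformed input,
    # which argparse reports as a usage error.
    parser.add_argument("--run_date", required=True, type=date.fromisoformat)
    return parser.parse_args(argv).run_date


# Simulated value standing in for the resolved {{job.start_time.iso_date}} reference.
run_date = parse_run_date(["--run_date", "2024-11-05"])
print(run_date.year, run_date.month, run_date.day)
```

Because the value arrives as an ordinary command-line argument, the same script works for manual runs that pass the date explicitly.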

Thanks again!

Walter_C · ‎11-05-2024

Can you confirm if this solution applies to you https://community.databricks.com/t5/data-engineering/retrieve-job-level-parameters-in-spark-python-t... ?

zed · ‎11-05-2024

So I think it is fine to change the spark_python_task to a notebook_task while keeping the file as a Python file instead of a notebook. Now I can use Databricks widgets to retrieve those parameters easily, and I still keep Python files rather than notebooks under version control.
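A minimal sketch of what reading the parameter through widgets might look like. It assumes the file is picked up as a source-format notebook (typically a `.py` file whose first line is `# Databricks notebook source`) and that the job parameter is named `run_date`. The helper `get_run_date` and its fallback are illustrative, added only so the file can also run outside Databricks.

```python
def get_run_date(default: str = "") -> str:
    """Return the run_date parameter from Databricks widgets, or `default` when run elsewhere."""
    try:
        # `dbutils` is provided as a global by the Databricks notebook runtime;
        # in a plain Python process the name does not exist and NameError is raised.
        return dbutils.widgets.get("run_date")  # noqa: F821
    except NameError:
        return default


# Outside Databricks this falls back to the default value.
run_date = get_run_date(default="20240101")
print(run_date)
```

The fallback only covers the case where `dbutils` is missing; inside Databricks, a widget that has not been defined would still raise an error from `dbutils.widgets.get`.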
