cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

Databricks sql variables and if/else workflow

JKR
New Contributor III

I have 2 tasks in databricks job workflow first task is of type SQL and SQL task is query.In that query I've declared 2 variables and SET the values by running query.e.g:

DECLARE VARIABLE max_timestamp TIMESTAMP DEFAULT '1970-01-01';
SET VARIABLE max_timestamp = (select max(timestamp) from Table);
select max_timestamp ;

Can I pass this max_timestamp variable to the next if/else task to check if it is null or is there any value in this variable and pass that value of max_timestamp to another task. or is there any workaround to do this? I want to use SQL warehouse for this.

I created a SQL task and assign max timestamp value from subquery to the SQL variable and also created if/else task and in condition I've added {{tasks. [task_name]. values.[value_name]}}, but the job throw below error on the if/else task.

run failed with error message

Failed to resolve references: Task value referenced by 'tasks.task_name.values.max_timestamp' was not found.

Query based dropdown list option in databricks sql editor will execute the query multiple times if I use the variable at multiple places in a big sql query ? If yes then it will be executing query to max_timestamp everytime I refer in the query.

1 REPLY 1

Kaniz
Community Manager
Community Manager

Hi @JKR, You can indeed pass variables between tasks in a Databricks job workflow. This is done using the taskValues subutility in Databricks Utilities. This utility allows tasks to output values that can be referenced in subsequent tasks.

 

However, it’s important to note that the error message you’re seeing, Failed to resolve references: Task value referenced by 'tasks.task_name.values.max_timestamp' was not found, suggests that the max_timestamp value was not set correctly in the SQL task, or there might be a mismatch in the task name or value name.

 

Regarding your question about the query-based dropdown list option in Databricks SQL editor, it appe.... If performance is a concern due to the repeated execution of the query, you might want to consider alternative strategies for managing your data.

 

Please note that the use of SQL variables like DECLARE and SET is only supported in Databricks runti.... If you’re using SQL Data Warehouse, you might need to use a Databricks cluster with a SQL notebook4. Also, the {{tasks.[task_name].values.[value_name]}} syntax is supported only in Python notebooks.

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.