Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Is there a way to discover in the next task if the previous for loop task has some...

jeremy98
Honored Contributor

Hi community,

As the title suggests, I'm looking for a smart way to determine which runs in a for-loop task succeeded and which didn't, so I can use that information in the next task.

Summary:

I have a for-loop task that runs multiple items (e.g., run1, run2, run3). Suppose only two of them succeed. In the next task, I want to process only the successful runs (e.g., run1 and run3).

6 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @jeremy98 ,

Maybe you can use the Jobs API 2.2 for that scenario and orchestrate your pipeline based on the returned payload?

Updating from Jobs API 2.1 to 2.2 | Databricks Documentation
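
For example, here's a minimal sketch of pulling the per-task states of the current job run with Jobs API 2.2 and keeping only the successful ones. The environment variable names and the exact shape of the for-each iteration entries in the payload are assumptions; check the 2.2 response schema for your workspace:

```python
# Minimal sketch: ask Jobs API 2.2 which tasks of the current run succeeded.
# DATABRICKS_HOST / DATABRICKS_TOKEN / JOB_RUN_ID are assumed to be provided,
# e.g. JOB_RUN_ID via the {{job.run_id}} dynamic value reference.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://adb-123.4.azuredatabricks.net
token = os.environ["DATABRICKS_TOKEN"]  # token for your service principal
run_id = os.environ["JOB_RUN_ID"]       # run id of the parent job run

resp = requests.get(
    f"{host}/api/2.2/jobs/runs/get",
    headers={"Authorization": f"Bearer {token}"},
    params={"run_id": run_id},
)
resp.raise_for_status()

# Keep the task keys whose result state is SUCCESS; the next task can
# consume this list and process only the successful items.
succeeded = [
    t["task_key"]
    for t in resp.json().get("tasks", [])
    if t.get("state", {}).get("result_state") == "SUCCESS"
]
print(succeeded)
```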

jeremy98
Honored Contributor

Okay, I was thinking about this solution... thanks!

What about the token we need to generate to make this API call? We're using a service principal; how do we generate a SAS token every time we need to call the API?

szymon_dybczak
Esteemed Contributor III

It depends a bit on what type of service principal you're using. Service principals can be either Azure Databricks managed service principals or Microsoft Entra ID managed service principals.
Azure Databricks managed service principals can authenticate to Azure Databricks using Databricks OAuth authentication and personal access tokens. Microsoft Entra ID managed service principals can authenticate to Azure Databricks using Databricks OAuth authentication and Microsoft Entra ID tokens.
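
For a Microsoft Entra ID managed service principal, a minimal sketch of the client-credentials flow looks like the snippet below; no SAS token is involved. The tenant/client values are placeholders from your own setup, and the scope uses the documented Azure Databricks resource ID:

```python
# Minimal sketch: obtain a Microsoft Entra ID token for a service principal
# via the client-credentials flow, then use it as a Bearer token on Jobs API
# calls. Tenant/client/secret values here are placeholders.
import os
import requests

tenant_id = os.environ["AZURE_TENANT_ID"]
client_id = os.environ["AZURE_CLIENT_ID"]          # the SP's application (client) ID
client_secret = os.environ["AZURE_CLIENT_SECRET"]

resp = requests.post(
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
    data={
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Well-known resource ID of the Azure Databricks service
        "scope": "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default",
    },
)
resp.raise_for_status()
access_token = resp.json()["access_token"]  # pass as: Authorization: Bearer <token>
```

The Databricks SDKs can also handle this token acquisition and refresh for you, so you don't have to generate a token manually on every call.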


jeremy98
Honored Contributor

We have Microsoft Entra ID managed service principals, which we then link in Databricks.

SebastianRowan
Contributor

The easiest way is to log each loop iteration's status with `dbutils.jobs.taskValues.set`, then grab those values in the next task and only work with the items that passed.
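
A minimal sketch of that idea, assuming the for-each body task is called `process_item` and receives the current item as a parameter named `item` (both names are made up here, and whether values set inside for-each iterations are retrievable downstream under that task key is worth verifying in your workspace):

```python
# --- Inside the for-each body task (task key: process_item) ---
item = dbutils.widgets.get("item")  # hypothetical parameter carrying the current item

try:
    do_work(item)  # hypothetical helper doing the real per-item processing
    dbutils.jobs.taskValues.set(key=f"status_{item}", value="SUCCESS")
except Exception:
    dbutils.jobs.taskValues.set(key=f"status_{item}", value="FAILED")
    raise

# --- Inside the downstream task ---
items = ["run1", "run2", "run3"]  # the same list the loop iterated over
succeeded = [
    i for i in items
    if dbutils.jobs.taskValues.get(
        taskKey="process_item", key=f"status_{i}", default="FAILED"
    ) == "SUCCESS"
]
print(succeeded)  # process only these items
```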

jeremy98
Honored Contributor

Hi, thanks for your answer! But couldn't this cause problems with API limits? Sometimes we iterate over 100 inputs.

Also, how do you set the key and value, and how do you retrieve them in batch in the next task?