
Many dbutils.notebook.run iterations in a workflow -> "Failed to checkout Git repository" error

Michael_Galli
Contributor II

Hi all,

I have a workflow that runs a single notebook with dbutils.notebook.run() in one long loop, passing different parameters on each iteration.
At some point, I get random Git errors in the notebook run:

com.databricks.WorkflowException: com.databricks.NotebookExecutionException: FAILED: Failed to checkout Git repository: UNAVAILABLE

If I run the workflow again, it might work, or fail at another stage.
It seems that I am hitting some kind of GitHub API limit in the workspace.
Is there any way or workaround to solve this?

 

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz
Community Manager

Hi @Michael_Galli, it appears that you're encountering Git-related issues during your notebook runs in Databricks.

 

Let’s address this step by step:

 

GitHub API Limit:

  • Databricks enforces rate limits for all REST API calls, including those related to Git integration.
  • These limits are set per endpoint and per workspace to ensure fair usage and high availability.
  • You’ll receive a 429 response status code if your requests exceed the rate limit.
  • To mitigate this, consider optimizing your API calls or spreading them out over time.
  • You can find more details about rate limits in the Databricks REST API reference.

Workspace Repositories:

Workaround Suggestions:

  • Retry Mechanism: Implement a retry mechanism in your workflow. If a Git error occurs, wait briefly and then retry the operation.
  • Throttle Requests: Introduce a delay between consecutive Git-related API calls to avoid hitting rate limits.
  • Error Handling: Catch Git-related exceptions and handle them gracefully. You can log the errors, retry, or take alternative actions.
  • Optimize Git Operations: Review your notebook code and identify any unnecessary or redundant Git operations. Minimize the number of Git-related actions if possible.
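
As a rough illustration of the retry, throttle, and error-handling suggestions above, here is a minimal Python sketch. It is not an official Databricks pattern: the helper name run_with_retry, the default delays, and the idea of matching on the "Failed to checkout Git repository" message text are all assumptions made for this example. The helper takes any zero-argument callable, so it can wrap a dbutils.notebook.run call inside the loop.

```python
import random
import time


def run_with_retry(
    run_once,
    max_attempts=4,
    base_delay=5.0,
    max_delay=60.0,
    # Assumption: the transient failure can be recognised by its message text.
    is_retryable=lambda exc: "Failed to checkout Git repository" in str(exc),
    sleep=time.sleep,
):
    """Call run_once() and retry retryable failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_once()
        except Exception as exc:
            # Give up on the last attempt or on errors we do not recognise.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Exponential backoff (5 s, 10 s, 20 s, ...) capped at max_delay,
            # plus up to 10% jitter so parallel runs do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            delay += random.uniform(0, delay * 0.1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)


# Hypothetical usage inside a Databricks notebook, where dbutils is available.
# The notebook path, timeout, and parameter_sets are placeholders:
#
# for params in parameter_sets:
#     run_with_retry(lambda: dbutils.notebook.run("./child_notebook", 3600, params))
#     time.sleep(2)  # small pause between runs to throttle Git checkouts
```

Errors that do not match the retry condition are re-raised immediately, so genuine failures in the child notebook still surface instead of being retried.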

Remember that Git-related issues can be tricky, but with careful handling and optimization, you can improve the reliability of your workflow. 

 

Good luck! 🚀
