Data Engineering

Problem when updating Databricks Repo through DevOps Pipeline

SJR
New Contributor III

Hello all!

I've been working on adding a Databricks Repos update API call to a DevOps pipeline so that the Databricks local repo stays up to date with the remote staging branch (the pipeline runs whenever there's a new commit to the staging branch). Everything seems to work fine (the request returns 200), but after the pipeline runs, the Databricks local repo still hasn't performed the git pull.
I tried checking out main and then staging to see if that would work, but it didn't change anything.

The weird thing is that if I run the same code using Postman or Python, the repo gets updated as expected.

Here's the pipeline's code:

 

task: PythonScript@0
  inputs:
    scriptSource: 'inline'
    script: |
      import requests as rq
      import os
      access_token = os.environ["DATABRICKS_TOKEN_GENERATOR_ACCESS_TOKEN"]
      url = "$(databrick-hostname)/api/2.0/repos/$(repo-id)"
      headers = {'Authentication': f'Bearer {access_token}'}
      data = '{"branch": "main"}'
      response = rq.patch(url, headers=headers, data = data)
      data = '{"branch": "staging"}'
      response = rq.patch(url, headers=headers, data = data)
      print(response)
  displayName: 'Update Staging project'

 


Any clues as to where I might be messing up? 

Thanks in advance!

1 ACCEPTED SOLUTION

SJR
New Contributor III

@BookerE1 I found it! There was already another thread about this problem, and someone there helped me find the solution: the problem was the agent pool that I was using for the pipeline.

This is the link to the other thread: https://community.databricks.com/t5/get-started-discussions/getting-html-sign-i-page-as-api-response...

Thanks for all your help and goodwill. Have a good one!

4 REPLIES

BookerE1
New Contributor III

Hello,

Here are some possible solutions to your problem, based on web search results that I found:

  • Make sure that your Databricks workspace and your Azure DevOps account are properly connected and authenticated.
  • Check if your Databricks Repos update API call is using the correct parameters and values. You can refer to the Databricks Repos API documentation for more details. In particular, make sure that the repo-id parameter matches the ID of the repository that you want to update, and the branch parameter matches the name of the branch that you want to pull from.
  • Try adding a delay or a retry mechanism in your pipeline code, in case the Databricks Repos update API call takes some time to complete or fails intermittently. You can use the time module in Python to add a delay, or the requests module to handle retries. For example, you can modify your code as follows:

import os
import time

import requests as rq
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

access_token = os.environ["DATABRICKS_TOKEN_GENERATOR_ACCESS_TOKEN"]
url = "$(databrick-hostname)/api/2.0/repos/$(repo-id)"
# The Databricks REST API expects the token in the Authorization header
headers = {'Authorization': f'Bearer {access_token}'}

# Option 1: add a delay of 10 seconds before each API call
for branch in ("main", "staging"):
    time.sleep(10)
    response = rq.patch(url, headers=headers, json={"branch": branch})
    print(response)

# Option 2: add a retry mechanism with 3 attempts and a backoff factor of 1 second.
# urllib3 does not retry PATCH by default, so it is listed in allowed_methods
# (urllib3 1.26 or later); status_forcelist names the status codes to retry on.
retry = Retry(total=3, backoff_factor=1,
              status_forcelist=[429, 500, 502, 503, 504],
              allowed_methods=["PATCH"])
session = rq.Session()
adapter = HTTPAdapter(max_retries=retry)
session.mount('https://', adapter)
session.mount('http://', adapter)

for branch in ("main", "staging"):
    response = session.patch(url, headers=headers, json={"branch": branch})
    print(response)

 

I hope this helps you to fix your problem and update your Databricks local repo using your DevOps pipeline. If you have any other questions or requests, please let me know.

SJR
New Contributor III

Hello @BookerE1 !

Thanks for your reply! Unfortunately, I tried everything and still nothing. I'm printing the text of the response that I'm getting, and it looks like the request is being sent to the Databricks sign-in page instead of being executed:

<!doctype html>
<html>
 <head>
  <meta charset="utf-8">
  <meta http-equiv="Content-Language" content="en">
  <title>Databricks - Sign In</title>
  <meta name="viewport" content="width=960">
  <link rel="icon" type="image/png" href="https://databricks-ui-assets.azureedge.net/favicon.ico">
  <meta http-equiv="content-type" content="text/html; charset=UTF8">
  <script id="__databricks_react_script"></script>
  <script>window.__DATABRICKS_SAFE_FLAGS__={"databricks.infra.showErrorModalOnFetchError":true,"databricks.fe.infra.useReact18":true,"databricks.fe.infra.useReact18NewAPI":false},window.__DATABRICKS_CONFIG__={"publicPath":{"mlflow":"https://databricks-ui-assets.azureedge.net/","dbsql":"https://databricks-ui-assets.azureedge.net/","feature-store":"https://databricks-ui-assets.azureedge.net/","monolith":"https://databricks-ui-assets.azureedge.net/","jaws":"https://databricks-ui-assets.azureedge.net/"}}</script>
  <link rel="icon" href="https://databricks-ui-assets.azureedge.net/favicon.ico">
  <script>
  function setNoCdnAndReload() {
      document.cookie = `x-databricks-cdn-inaccessible=true; path=/; max-age=86400`;
      const metric = 'cdnFallbackOccurred';
      const browserUserAgent = navigator.userAgent;
      const browserTabId = window.browserTabId;
      const performanceEntry = performance.getEntriesByType('resource').filter(e => e.initiatorType === 'script').slice(-1)[0]
      sessionStorage.setItem('databricks-cdn-fallback-telemetry-key', JSON.stringify({ tags: { browserUserAgent, browserTabId }, performanceEntry}));
      window.location.reload();
  }
</script>
  <script>
  // Set a manual timeout for dropped packets to CDN
  function loadScriptWithTimeout(src, timeout) {
     return new Promise((resolve, reject) => {
        const script = document.createElement('script');
          script.defer = true;
          script.src=src;
          script.onload = resolve;
          script.onerror = reject;
          document.head.appendChild(script);
          setTimeout(() => {
              reject(new Error('Script load timeout'));
          }, timeout);
      });
  }
  loadScriptWithTimeout('https://databricks-ui-assets.azureedge.net/static/js/login/login.21e80507.js', 10000).catch(setNoCdnAndReload);
</script>
 </head>
 <body class="light-mode">
  <uses-legacy-bootstrap>
   <div id="login-page"></div>
  </uses-legacy-bootstrap>
 </body>
</html>
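
The response above comes back with status 200 even though the body is an HTML sign-in page, so printing only the response object hides the failure. A minimal sketch of a check that could make the pipeline step fail loudly in that case, assuming the endpoint answers with a JSON body on success; the names `NotAnApiResponse` and `parse_repos_response` are hypothetical and not part of any Databricks or requests API:

```python
import json


class NotAnApiResponse(RuntimeError):
    """Raised when a reply does not look like JSON from a REST endpoint."""


def parse_repos_response(status_code, content_type, body):
    """Return the decoded JSON body, or raise if the reply is not API JSON.

    The arguments are plain values so the check works with any HTTP client.
    With requests they would be response.status_code,
    response.headers.get("Content-Type", "") and response.text.
    """
    if not 200 <= status_code < 300:
        raise NotAnApiResponse(f"HTTP {status_code}: {body[:200]}")
    # A sign-in page is served as text/html, even when the status is 200.
    if "json" not in content_type.lower():
        raise NotAnApiResponse(
            f"expected JSON but got {content_type or 'no Content-Type'}; "
            "the request may have been answered by a sign-in page"
        )
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise NotAnApiResponse(f"body is not valid JSON: {exc}") from exc
```

In the inline script this would replace `print(response)`. An uncaught exception makes the Python script exit with a non-zero code, which should mark the pipeline step as failed instead of letting it pass silently.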



BookerE1
New Contributor III

Hello, 

I’m sorry to hear that you are having trouble logging in to Databricks.

Please try the steps below. I think they will help.

  • Clear your browser cache and cookies: Sometimes, your browser may store outdated or corrupted data that interferes with the login process. You can try clearing your browser cache and cookies, and then try logging in again. You can follow the instructions here to clear your browser data.
  • Check your email address and password: Make sure that you are using the correct email address and password that you registered with. If you forgot your password, you can request a password reset link by clicking on the “Forgot Password?” link on the login page. You should receive an email with a link to reset your password. If you don’t see the email, check your spam or junk folder, or try a different email address.
  • Log out of Microsoft Entra ID: If you are using a Microsoft account to log in to Databricks, you may encounter an error if you have not enabled multi-factor authentication (MFA) for your account. To resolve this problem, you must log out of Microsoft Entra ID by going to portal.azure.com and logging out. When you log back in, you should get the prompt to use MFA to log in. If that does not work, try logging out completely from all Azure services before attempting to log in again.

I hope this helps you resolve your login issue. If you have any other questions or feedback, please let me know. I’m always happy to help.

 

 

Best regards,
BookerE1
