Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to get schedule information about a job in Databricks?

jeremy98
Contributor III

Hi community,

I was reading the Databricks API documentation, and I want to find out whether a job's schedule has the status PAUSED or UNPAUSED. I found this API call: https://docs.databricks.com/api/workspace/jobs/get

I wrote the following code for it and tested it:

    def _get_single_job_api_call(self):
        """Get single job information through Databricks API call"""

        headers = {
            'Authorization': f'Bearer {self.access_token}',
            'Content-Type': 'application/json'
        }

        try:
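            response = requests.get(
                url=f'{self.workspace_host}/api/2.2/jobs/get',
                params={'job_id': self.job_id},  # jobs/get documents job_id as a query parameter
                headers=headers,
                timeout=30  # without a timeout, the Timeout handler below can never fire
            )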
            response.raise_for_status()  # Raises HTTPError for bad responses (4xx and 5xx)
            return response.json()
        except requests.exceptions.Timeout:
            logger.error("Request timed out.")
        except requests.exceptions.ConnectionError:
            logger.error("Failed to connect to the server.")
        except requests.exceptions.HTTPError as e:
            logger.error(f"HTTP error occurred: {e}")
        except requests.exceptions.RequestException as e:
            logger.error(f"Request failed: {e}")
        except Exception as e:
            logger.error(f"Unexpected error: {e}")
        
        return None


    def is_job_scheduled(self):
        # The helper returns None on failure, so check the result before indexing into it.
        get_job_status = self._get_single_job_api_call()
        
        if not get_job_status:
            logger.error("Failed to retrieve job status.")
            return True  # Safeguard by assuming it's scheduled
        
        try:
            schedule_job_status = get_job_status['settings']['schedule']['pause_status']
            return schedule_job_status != 'UNPAUSED'  # True if PAUSED or any other value, False only if explicitly UNPAUSED
        except KeyError:
            logger.error("Job schedule information is missing. Assuming job is scheduled for safety.")
            return True  # Safeguard by assuming it's scheduled

But I'm receiving errors like: HTTP error occurred: 403 Client Error: Forbidden for url: <workspace_host>/api/2.2/jobs/get

with this message in the response body:

{"error_code":401,"message":"Credential was not sent or was of an unsupported type for this API."}

How can I solve this? Am I making the API call incorrectly?
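For reference, the schedule lookup can be separated from the HTTP call so that a failed request and a job without a schedule are easy to tell apart. This is only a minimal sketch: the helper name `get_schedule_pause_status` is hypothetical, the response shape (`settings` → `schedule` → `pause_status`) is taken from the code above, and it assumes that a job without a schedule simply omits the `schedule` key.

```python
def get_schedule_pause_status(job):
    """Return the schedule's pause_status ('PAUSED' or 'UNPAUSED'), or None.

    `job` is the parsed JSON from jobs/get, or None if the request failed.
    None is also returned when the job has no schedule block (assumption:
    unscheduled jobs omit the 'schedule' key).
    """
    if not job:
        return None
    schedule = (job.get("settings") or {}).get("schedule")
    if not schedule:
        return None
    return schedule.get("pause_status")
```

A caller can then decide explicitly what to do for each of the three outcomes (PAUSED, UNPAUSED, or None) instead of relying on a KeyError.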

Accepted Solutions

KaranamS
Contributor II

Hi @jeremy98 , 

It looks like the access token is missing or not valid. Can you please verify the following?

1. Validate your access token with a direct call. If this also returns a 403 Forbidden error, the token is most likely missing, expired, or otherwise invalid.

curl -X GET "https://<workspace_host>/api/2.2/jobs/get" \
-H "Authorization: Bearer <your_token>" \
-H "Content-Type: application/json" \
-d '{"job_id": <your_job_id>}'

2. Validate your workspace host URL:

print(self.workspace_host)

It should be in this format, with no trailing slash at the end: https://<databricks-instance>.cloud.databricks.com
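The two checks above can also be run in code before any request is sent. The following is a rough sketch, not an official Databricks utility: the function name `find_config_problems` and its messages are made up for illustration, and it only checks the shape of the values, not whether the token is actually accepted by the workspace.

```python
from urllib.parse import urlparse


def find_config_problems(workspace_host, access_token):
    """Return a list of readable problems; an empty list means the values look usable."""
    problems = []

    # A missing or blank token is one way to end up with "Credential was not sent".
    if not access_token or not access_token.strip():
        problems.append("access token is missing or empty")

    if not workspace_host:
        problems.append("workspace host is missing or empty")
        return problems

    # The host is expected to include the https:// scheme and a host name.
    parsed = urlparse(workspace_host)
    if parsed.scheme != "https" or not parsed.netloc:
        problems.append("workspace host should look like https://<databricks-instance>")

    # A trailing slash would produce a double slash once the API path is appended.
    if workspace_host.endswith("/"):
        problems.append("workspace host should not end with a trailing slash")

    return problems
```

Logging the returned list before calling the API makes it obvious when a value was never set in the current environment.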

Hope this helps!

2 REPLIES

Yes, thanks. The problem was that the access_token was not set in the other environment 😅
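One way to catch this kind of problem earlier is to fail fast when the token is absent instead of sending a request with an empty credential. This is a minimal sketch under assumptions: the thread does not say how `access_token` is loaded, so reading it from an environment variable named `DATABRICKS_TOKEN` is only an illustration, and `load_access_token` is a hypothetical helper.

```python
import os


def load_access_token(env=None, var_name="DATABRICKS_TOKEN"):
    """Return the token from the environment, or raise if it is unset or blank.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    The variable name is an assumption, adjust it to your own setup.
    """
    env = os.environ if env is None else env
    token = (env.get(var_name) or "").strip()
    if not token:
        raise RuntimeError(
            f"{var_name} is not set in this environment; cannot call the Jobs API."
        )
    return token
```

Raising at startup gives a clear message in the environment where the variable is missing, rather than a 403 from the API later on.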
