Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

REST API invocation for Databricks notebook fails when invoked from ADF pipeline

sparkstreaming
New Contributor III

In our current implementation, a streaming Databricks notebook needs to be started based on the configuration passed to it. Since the rest of our Databricks notebooks are invoked from ADF, we decided to use ADF to start this one as well. Because follow-up activities need to run after the notebook starts, we tried to start the streaming notebook from an ADF pipeline via the REST API, using the ADF Web activity to call the endpoint that runs a notebook. However, the call fails with a MALFORMED_REQUEST error:

{"error_code":"MALFORMED_REQUEST","message":"Invalid JSON given in the body of the request - failed to parse given JSON"}

The JSON snippet below is passed as the body of the request:

{
  "tasks": [
    {
      "task_key": "Match",
      "description": "Matches orders with user sessions",
      "notebook_task": {
        "notebook_path": "/Users/userxxx@xxxxsandbox.com/Demo/RealTimeXXXXXXX",
      },
      "timeout_seconds": 86400
    }
  ],
  "run_name": "A multitask job run",
  "git_source": null,
  "timeout_seconds": 86400
}

Can somebody tell me what I am doing wrong here?

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

@Prasanth KP,

  • Remove the trailing comma after XXXXXXX" (JSON does not allow trailing commas).
  • Remove "git_source": null.
  • In ADF, I had to wrap everything in single quotes.
  • Please test your API call with curl or Postman first.

'{
  "tasks": [
    {
      "task_key": "Match",
      "description": "Matches orders with user sessions",
      "notebook_task": {
        "notebook_path": "/Users/userxxx@xxxxsandbox.com/Demo/RealTimeXXXXXXX"
      },
      "timeout_seconds": 86400
    }
  ],
  "run_name": "A multitask job run",
  "timeout_seconds": 86400
}'
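
A quick way to catch the parse error before the body ever reaches ADF is to run it through a strict JSON parser locally. The sketch below uses Python's standard `json` module on a shortened copy of the body from the question; `check_json` and `fixed_body` are hypothetical names used only for illustration. Note that `"git_source": null` is syntactically valid JSON, so a local parse only flags the trailing comma; whether the API accepts the null field is a separate matter.

```python
import json

# Shortened copy of the body from the question, with the trailing comma
# after notebook_path and the "git_source": null entry left in place.
original_body = """
{
  "tasks": [
    {
      "task_key": "Match",
      "notebook_task": {
        "notebook_path": "/Users/userxxx@xxxxsandbox.com/Demo/RealTimeXXXXXXX",
      },
      "timeout_seconds": 86400
    }
  ],
  "run_name": "A multitask job run",
  "git_source": null,
  "timeout_seconds": 86400
}
"""


def check_json(text):
    """Return (True, None) if text parses as JSON, else (False, description)."""
    try:
        json.loads(text)
        return True, None
    except json.JSONDecodeError as exc:
        # lineno and colno point at the spot where the parser gave up.
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


# The original body is rejected because of the trailing comma.
ok, error = check_json(original_body)
print(ok, error)

# Removing only the trailing comma is enough to make the body parse.
fixed_body = original_body.replace('RealTimeXXXXXXX",', 'RealTimeXXXXXXX"')
print(check_json(fixed_body))
```

The exact wording of the parser message varies between Python versions, but the reported line and column point at the closing brace that follows the stray comma.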


Thanks Hubert. This worked.

-werners-
Esteemed Contributor III

@Prasanth KP,

Clearly, the REST call is invalid. Which endpoint are you calling?

Also do not forget to authenticate.

May I ask why you use the REST API instead of the available notebook functionality of ADF?
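
For testing outside ADF, the endpoint and authentication points above can be sketched in Python. This is only an illustration under stated assumptions: the thread never names the endpoint, so the Jobs API path `/api/2.1/jobs/runs/submit` is a guess based on the shape of the body; the workspace URL and token are placeholders; and `build_run_submit_request` is a hypothetical helper. A real request may also need cluster settings for the task, which the thread does not show.

```python
import json
import urllib.request


def build_run_submit_request(workspace_url, token, payload):
    """Build (but do not send) an authenticated POST request.

    Assumption: the target is the Jobs API one-time run endpoint,
    /api/2.1/jobs/runs/submit. The thread does not confirm this.
    """
    # json.dumps always emits valid JSON, so a trailing comma cannot sneak in.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.1/jobs/runs/submit",
        data=body,
        method="POST",
        headers={
            # Personal access token passed as a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


payload = {
    "tasks": [
        {
            "task_key": "Match",
            "description": "Matches orders with user sessions",
            "notebook_task": {
                "notebook_path": "/Users/userxxx@xxxxsandbox.com/Demo/RealTimeXXXXXXX"
            },
            "timeout_seconds": 86400,
        }
    ],
    "run_name": "A multitask job run",
    "timeout_seconds": 86400,
}

# Placeholder workspace URL and token; replace with real values before sending.
request = build_run_submit_request(
    "https://example.invalid", "dapi-PLACEHOLDER", payload
)
print(request.get_method(), request.full_url)

# To actually send the request, uncomment the next line:
# response = urllib.request.urlopen(request)
```

Building the body from a Python dict rather than a hand-written string sidesteps the quoting and comma problems entirely, which makes it a useful cross-check against what the ADF Web activity sends.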

Thank you Kaniz. The solution suggested by Hubert worked.
