
REST API invocation of a Databricks notebook fails when called from an ADF pipeline

sparkstreaming
New Contributor III

In the current implementation, a streaming Databricks notebook needs to be started based on the configuration passed to it. Since the rest of our Databricks notebooks are invoked from ADF, we decided to use ADF to start this one as well. Because follow-up activities need to run after the notebook starts, we tried to start the streaming notebook from an ADF pipeline via the REST API, using the ADF Web activity to make the call that runs the notebook. However, when we invoke the notebook, the call fails with a MALFORMED_REQUEST error:

{"error_code":"MALFORMED_REQUEST","message":"Invalid JSON given in the body of the request - failed to parse given JSON"}

The following JSON snippet is passed as the body of the request:

{
  "tasks": [
    {
      "task_key": "Match",
      "description": "Matches orders with user sessions",
      "notebook_task": {
        "notebook_path": "/Users/userxxx@xxxxsandbox.com/Demo/RealTimeXXXXXXX",
      },
      "timeout_seconds": 86400
    }
  ],
  "run_name": "A multitask job run",
  "git_source": null,
  "timeout_seconds": 86400
}

Can somebody tell me what I am doing wrong here?

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

@Prasanth KP​ ,

  • Remove the comma after the notebook_path value (the one following XXXXXXX"); JSON does not allow trailing commas.
  • Remove "git_source": null.
  • In ADF, I had to wrap the whole body in single quotes.
  • Please test your API call with curl or Postman first.

'{
  "tasks": [
    {
      "task_key": "Match",
      "description": "Matches orders with user sessions",
      "notebook_task": {
        "notebook_path": "/Users/userxxx@xxxxsandbox.com/Demo/RealTimeXXXXXXX"
      },
      "timeout_seconds": 86400
    }
  ],
  "run_name": "A multitask job run",
  "timeout_seconds": 86400
}'
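
Before pasting a body into ADF, it can help to check locally that it parses as JSON at all. The snippet below is a minimal sketch of such a check using only Python's standard library; the helper name check_json is made up for illustration, and the body is an abbreviated copy of the one from the question. It shows that the comma after the notebook path is enough to make the parser reject the body.

```python
import json

# Abbreviated copy of the body from the question.
# Note the comma after the notebook_path value.
broken_body = '''
{
  "tasks": [
    {
      "task_key": "Match",
      "notebook_task": {
        "notebook_path": "/Users/userxxx@xxxxsandbox.com/Demo/RealTimeXXXXXXX",
      },
      "timeout_seconds": 86400
    }
  ],
  "run_name": "A multitask job run",
  "timeout_seconds": 86400
}
'''


def check_json(text):
    """Hypothetical helper: return (True, parsed) for valid JSON,
    otherwise (False, a short description of where parsing failed)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


# The original body is rejected because of the trailing comma.
ok_broken, detail = check_json(broken_body)

# Dropping that one comma makes the same body parse.
fixed_body = broken_body.replace('RealTimeXXXXXXX",', 'RealTimeXXXXXXX"')
ok_fixed, parsed = check_json(fixed_body)

print("original body valid:", ok_broken, "-", detail)
print("fixed body valid:", ok_fixed)
```

The same check catches most copy-and-paste mistakes (missing braces, stray commas) before the request ever reaches the API.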

7 REPLIES

Thanks Hubert. This worked.

Hi @Prasanth KP​ , I'm glad that @Hubert Dudek​ 's suggestions worked. Would you like to mark his answer as the best?

-werners-
Esteemed Contributor III

@Prasanth KP​ ,

Clearly, the REST call is invalid. Which endpoint are you calling?

Also, do not forget to authenticate.

May I ask why you use the REST API instead of the available notebook functionality of ADF?
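
To make the endpoint and authentication questions concrete, here is a rough sketch of how such a request could be assembled outside ADF with Python's standard library. The thread never states which endpoint was called; /api/2.1/jobs/runs/submit is an assumption based on the shape of the body, and the workspace URL, the token placeholder, and the helper name build_submit_request are all hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_submit_request(host, token, payload):
    """Hypothetical helper: build (but do not send) an authenticated POST.

    Assumes the one-time run endpoint /api/2.1/jobs/runs/submit and
    bearer-token authentication; the thread does not confirm either.
    """
    # json.dumps always emits valid JSON, so no trailing commas can slip in.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/api/2.1/jobs/runs/submit",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Corrected body from the accepted solution, written as a Python dict.
payload = {
    "tasks": [
        {
            "task_key": "Match",
            "description": "Matches orders with user sessions",
            "notebook_task": {
                "notebook_path": "/Users/userxxx@xxxxsandbox.com/Demo/RealTimeXXXXXXX"
            },
            "timeout_seconds": 86400,
        }
    ],
    "run_name": "A multitask job run",
    "timeout_seconds": 86400,
}

# Placeholder workspace URL and token; replace both with real values.
request = build_submit_request(
    "https://example.cloud.databricks.com", "<personal-access-token>", payload
)

# To actually send it: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```

Building the body from a dict rather than a hand-written string sidesteps the kind of syntax slip that caused the original error.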

Kaniz
Community Manager

Hi @Prasanth KP, just a friendly follow-up. Do you still need help, or did the responses from @Hubert Dudek and @Werner Stinckens help you find the solution? Please let us know.

sparkstreaming
New Contributor III
Thank you Kaniz. The solution suggested by Hubert worked.

Hi @Prasanth KP, in that case, would you like to mark @Hubert Dudek's answer as the best?
