Databricks job trigger at specific times
09-18-2024 06:53 AM
Hello,
I have a Databricks notebook that processes data and generates a list of JSON objects called "list_json". Each JSON object contains a field called "time_to_send" (a UTC datetime). I want to find the best way to send each of these JSON messages in a POST request within the hour before its "time_to_send". What is the best approach to achieve this?
Thank you.
09-18-2024 08:13 AM
Hi @dbx_deltaSharin,
You can write a Python function that takes this list_json as an argument and sends a POST request for each object in the list. Since you need to send the requests within an hour, you can use Python's multiprocessing or asyncio library to speed things up. But it depends on how many objects are in your list, etc.
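For the asyncio route, here is a minimal sketch of what such a function could look like, assuming the aiohttp package is installed (e.g. via %pip install aiohttp) and a hypothetical API_URL endpoint; retries and error handling are omitted:

```python
import asyncio
import aiohttp

# Hypothetical endpoint; replace with your real API URL.
API_URL = "https://example.com/api/messages"

async def send_one(session: aiohttp.ClientSession, payload: dict) -> int:
    # POST a single JSON object and return the HTTP status code.
    async with session.post(API_URL, json=payload) as resp:
        resp.raise_for_status()
        return resp.status

async def send_all(list_json: list[dict]):
    # Fan out all POSTs concurrently; gather collects results and exceptions.
    async with aiohttp.ClientSession() as session:
        tasks = [send_one(session, obj) for obj in list_json]
        return await asyncio.gather(*tasks, return_exceptions=True)

# In a plain Python script this is fine; inside a notebook that already
# has a running event loop, use `await send_all(list_json)` instead.
results = asyncio.run(send_all(list_json))
```

Because the requests run concurrently rather than one after another, even a few thousand objects should comfortably fit inside the one-hour window.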
09-18-2024 01:03 PM
Hi @dbx_deltaSharin,
In addition to @szymon_dybczak's suggestion, if you're using Azure you might consider an architecture where, instead of sending the request directly to your API, you send a message to an Azure Queue or Service Bus. An Azure Function with a Queue Trigger can then pick up the message and send it to the API. This approach improves scalability and reliability because Azure Functions can process multiple requests concurrently and scale automatically based on demand. The same pattern works with other cloud providers, as they offer similar services.
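As a rough illustration of that pattern, the notebook-side half might look like the sketch below, which enqueues each JSON object into an Azure Storage Queue using the azure-storage-queue package. The connection string and queue name are placeholders, and the Queue-Triggered Function that performs the actual POST lives outside Databricks. One detail worth noting: send_message accepts a visibility_timeout, which can hide each message until its one-hour window opens.

```python
import json
from datetime import datetime, timedelta, timezone
from azure.storage.queue import QueueClient

# Placeholder values; in practice, read the connection string from a secret scope.
CONNECTION_STRING = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=..."
QUEUE_NAME = "time-to-send-messages"

queue_client = QueueClient.from_connection_string(CONNECTION_STRING, QUEUE_NAME)

now = datetime.now(timezone.utc)
for obj in list_json:
    # Assumes time_to_send is an ISO-8601 UTC string, e.g. "2024-09-19T10:00:00+00:00".
    window_opens = datetime.fromisoformat(obj["time_to_send"]) - timedelta(hours=1)
    delay = max(0, int((window_opens - now).total_seconds()))
    # visibility_timeout keeps the message invisible until the window opens
    # (capped at 7 days), so the Function only dequeues it once it is due.
    queue_client.send_message(json.dumps(obj), visibility_timeout=delay)
```

With a daily producer run, delays never exceed 24 hours, so the 7-day visibility cap is not a constraint here.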
09-18-2024 11:30 PM
Hi everyone,
Thank you for your responses to my question.
@szymon_dybczak, if I understood correctly, your suggestion assumes running the Databricks job in continuous mode. However, this might incur significant costs if the cluster runs every hour.
@filipniziol, your proposal seems like a viable solution. I would just like a clearer idea of the associated costs so I can compare the two options.
For clarification, the initial notebook is designed to run once a day to update and compute the JSON list. A second notebook is needed to process this JSON data and handle the post-processing, starting one hour before each "time_to_send".
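To compare against the continuous-mode option, one alternative for that second notebook is an hourly scheduled job that filters the daily list down to the messages due in the next hour. A minimal sketch, assuming time_to_send is an ISO-8601 UTC string and a hypothetical API_URL endpoint:

```python
from datetime import datetime, timedelta, timezone

import requests

# Hypothetical endpoint for the POST requests.
API_URL = "https://example.com/api/messages"

def send_due_messages(list_json: list[dict]) -> None:
    # Run this once per hour: any message whose time_to_send falls within
    # the next hour is posted now, i.e. within 1 hour before time_to_send.
    now = datetime.now(timezone.utc)
    for obj in list_json:
        send_time = datetime.fromisoformat(obj["time_to_send"])
        if now < send_time <= now + timedelta(hours=1):
            requests.post(API_URL, json=obj, timeout=30).raise_for_status()
```

This keeps everything inside Databricks at the cost of one short job-cluster run per hour, which is the figure to weigh against the queue-based architecture.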

