<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Databricks job trigger in specific times in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90929#M38030</link>
    <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/91173"&gt;@dbx_deltaSharin&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;You can write a Python function that takes list_json as an argument and sends a POST request for each object in the list. Since each request has to go out within an hour, you can use Python's multiprocessing or asyncio library to speed up the sending.&lt;/P&gt;&lt;P&gt;Whether you need that depends on how many objects the list contains, among other things.&lt;/P&gt;</description>
    <pubDate>Wed, 18 Sep 2024 15:13:34 GMT</pubDate>
    <dc:creator>szymon_dybczak</dc:creator>
    <dc:date>2024-09-18T15:13:34Z</dc:date>
    <item>
      <title>Databricks job trigger in specific times</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90912#M38026</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have a Databricks notebook that processes data and generates a list of JSON objects called "list_json". Each JSON object contains an item called "time_to_send" (in UTC datetime format). I want to find the best way to send these JSON messages in a POST request within 1 hour before the "time_to_send". What is the best approach to achieve this?&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Sep 2024 13:53:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90912#M38026</guid>
      <dc:creator>dbx_deltaSharin</dc:creator>
      <dc:date>2024-09-18T13:53:41Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks job trigger in specific times</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90929#M38030</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/91173"&gt;@dbx_deltaSharin&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;You can write a Python function that takes list_json as an argument and sends a POST request for each object in the list. Since each request has to go out within an hour, you can use Python's multiprocessing or asyncio library to speed up the sending.&lt;/P&gt;&lt;P&gt;Whether you need that depends on how many objects the list contains, among other things.&lt;/P&gt;</description>
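      <!--
The reply above suggests a function that sends one POST request per object in list_json, using asyncio to overlap the requests. A minimal sketch of that idea follows, using only the standard library. The endpoint URL, the function names post_json and send_all, and the max_concurrency parameter are assumptions made for illustration; none of them come from the thread.

```python
import asyncio
import json
import urllib.request

API_URL = "https://example.invalid/messages"  # hypothetical endpoint, not from the thread


def post_json(url, payload, timeout=10.0):
    """Blocking POST of one JSON object; returns the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


async def send_all(list_json, url=API_URL, post=post_json, max_concurrency=8):
    """Send every object in list_json, at most max_concurrency at a time.

    Returns one entry per object, in input order: the value returned by
    post on success, or the exception instance if that request failed.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def send_one(payload):
        async with semaphore:
            try:
                # Run the blocking call in a worker thread so requests overlap.
                return await asyncio.to_thread(post, url, payload)
            except Exception as exc:  # one failure should not stop the rest
                return exc

    return await asyncio.gather(*(send_one(p) for p in list_json))
```

From a plain script this could be called as asyncio.run(send_all(list_json)). In a notebook where an event loop is already running, awaiting send_all(list_json) directly is the likely equivalent. Whether the extra concurrency is worth it depends, as the reply says, on how many objects the list holds.
      -->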
      <pubDate>Wed, 18 Sep 2024 15:13:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90929#M38030</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2024-09-18T15:13:34Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks job trigger in specific times</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90964#M38044</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/91173"&gt;@dbx_deltaSharin&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;In addition to what&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;&amp;nbsp;suggested: if you're using Azure, you might consider an architecture where, instead of sending the request directly to your API, you send a message to an Azure Queue or Service Bus. An Azure Function with a Queue Trigger can then pick up the message and send it to the API. This approach improves scalability and reliability because Azure Functions can process multiple requests concurrently and scale automatically based on demand. Other cloud providers offer similar services, so the same design can be built there as well.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Sep 2024 20:03:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90964#M38044</guid>
      <dc:creator>filipniziol</dc:creator>
      <dc:date>2024-09-18T20:03:45Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks job trigger in specific times</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90995#M38056</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;Thank you for your responses to my question.&lt;/P&gt;&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;, if I understood correctly, your suggestion is based on running the Databricks job in continuous mode. However, this might incur significant costs if the cluster is running every hour.&lt;/P&gt;&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/117376"&gt;@filipniziol&lt;/a&gt;, your proposal seems like a viable solution. I would just like to get a clearer idea of the associated costs to be able to compare the two options.&lt;/P&gt;&lt;P&gt;For clarification, the initial notebook is designed to run once a day to update and compute the JSON list. Another notebook is needed to process this JSON data and handle the post-processing, starting one hour before the "time_to_send."&lt;/P&gt;</description>
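      <!--
The clarification above describes a second notebook that starts handling each object one hour before its "time_to_send". One way to avoid a continuously running cluster is a job that runs on a schedule and, on each run, picks only the objects whose send window is currently open. The sketch below shows that selection step only. It assumes "time_to_send" is an ISO 8601 string in UTC; the names SEND_WINDOW, parse_utc and due_now are hypothetical and do not come from the thread.

```python
from datetime import datetime, timedelta, timezone

SEND_WINDOW = timedelta(hours=1)  # "within 1 hour before time_to_send", per the question


def parse_utc(value):
    """Parse an ISO 8601 string; values without an offset are assumed to be UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def due_now(list_json, now=None, window=SEND_WINDOW):
    """Return the objects whose send window is open at the given moment.

    An object is due when time_to_send - window <= now < time_to_send.
    """
    now = now or datetime.now(timezone.utc)
    due = []
    for item in list_json:
        send_at = parse_utc(item["time_to_send"])
        if send_at - window <= now < send_at:
            due.append(item)
    return due
```

With this shape, each scheduled run would call due_now(list_json) and send only what it returns. Objects already sent would still need to be tracked somewhere (for example a flag stored with the list) so that a later run inside the same window does not send them twice; that bookkeeping is left out here.
      -->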
      <pubDate>Thu, 19 Sep 2024 06:30:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-job-trigger-in-specific-times/m-p/90995#M38056</guid>
      <dc:creator>dbx_deltaSharin</dc:creator>
      <dc:date>2024-09-19T06:30:52Z</dc:date>
    </item>
  </channel>
</rss>

