<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Databricks - autostart from jdbc query in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/databricks-autostart-from-jdbc-query/m-p/17511#M11528</link>
    <description>&lt;P&gt;Hi team,&lt;/P&gt;
&lt;P&gt;I'm new to Databricks and trying to understand whether it has a true auto-start capability. We are evaluating Databricks Delta Lake as an alternative cloud-based data warehouse solution, but the biggest problem I see is the inability to have a cluster auto-start.&lt;/P&gt;
&lt;P&gt;Setting up a pool of idle VMs slightly improves the start-up time for clusters of different sizes, but there doesn't seem to be a way to auto-start a cluster on a JDBC request from an external client (i.e., a business user making a BI request).&lt;/P&gt;
&lt;P&gt;What I have seen in the community are references to scheduled start-up scripts, which would be fine for known workloads, but I'm referring to a scenario where the usage is unknown, so the start needs to be automated.&lt;/P&gt;
&lt;P&gt;Let me know if there is something obvious I'm missing, but I can't see a solution with Databricks, which means it will stay in the data science/ML environment, leaving other cloud-based data warehouses to run the BI workloads.&lt;/P&gt;</description>
    <pubDate>Wed, 21 Jul 2021 08:26:27 GMT</pubDate>
    <dc:creator>nickmaco</dc:creator>
    <dc:date>2021-07-21T08:26:27Z</dc:date>
    <item>
      <title>Databricks - autostart from jdbc query</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-autostart-from-jdbc-query/m-p/17511#M11528</link>
      <description>&lt;P&gt;Hi team,&lt;/P&gt;
&lt;P&gt;I'm new to Databricks and trying to understand whether it has a true auto-start capability. We are evaluating Databricks Delta Lake as an alternative cloud-based data warehouse solution, but the biggest problem I see is the inability to have a cluster auto-start.&lt;/P&gt;
&lt;P&gt;Setting up a pool of idle VMs slightly improves the start-up time for clusters of different sizes, but there doesn't seem to be a way to auto-start a cluster on a JDBC request from an external client (i.e., a business user making a BI request).&lt;/P&gt;
&lt;P&gt;What I have seen in the community are references to scheduled start-up scripts, which would be fine for known workloads, but I'm referring to a scenario where the usage is unknown, so the start needs to be automated.&lt;/P&gt;
&lt;P&gt;Let me know if there is something obvious I'm missing, but I can't see a solution with Databricks, which means it will stay in the data science/ML environment, leaving other cloud-based data warehouses to run the BI workloads.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jul 2021 08:26:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-autostart-from-jdbc-query/m-p/17511#M11528</guid>
      <dc:creator>nickmaco</dc:creator>
      <dc:date>2021-07-21T08:26:27Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks - autostart from jdbc query</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-autostart-from-jdbc-query/m-p/17512#M11529</link>
      <description>&lt;P&gt;Just adding on to this.&lt;/P&gt;
&lt;P&gt;Using DBeaver as a client with a single-node cluster and a pool of idle VMs, I got the cluster auto-start time down to 35 seconds. Adding 17 seconds of query time to show the first 200 rows of a 500,000-record object, that doesn't really compare to the other data warehouse SaaS products out there.&lt;/P&gt;
&lt;P&gt;I even checked to make sure I was not using the ML runtime.&lt;/P&gt;
&lt;P&gt;I'd welcome others' thoughts and opinions on what approach to take, or whether to just leave this as not viable.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jul 2021 12:53:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-autostart-from-jdbc-query/m-p/17512#M11529</guid>
      <dc:creator>nickmaco</dc:creator>
      <dc:date>2021-07-21T12:53:02Z</dc:date>
    </item>
  </channel>
</rss>

