<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Azure Data Factory: allocate resources per Notebook in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/azure-data-factory-allocate-resources-per-notebook/m-p/33096#M1751</link>
    <description>&lt;P&gt;I understand that, in your case, auto-scaling would take too much time.&lt;/P&gt;&lt;P&gt;The simplest option is to use a separate cluster for each notebook (and make sure the previous cluster is terminated immediately).&lt;/P&gt;&lt;P&gt;Another option is to call the REST API 2.0/clusters/resize endpoint to resize the cluster: &lt;A href="https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/clusters#--resize" target="_blank"&gt;https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/clusters#--resize&lt;/A&gt;&lt;/P&gt;&lt;P&gt;You can even do it directly from a notebook; here is a script that detects all the required parameters:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import requests

# Read the workspace host name, cluster id, and API token from the notebook context
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
domain_name = ctx.tags().get("browserHostName").get()
cluster_id = ctx.clusterId().get()
host_token = ctx.apiToken().get()

# Resize the current cluster to 2 workers
response = requests.post(
    f'https://{domain_name}/api/2.0/clusters/resize',
    headers={'Authorization': f'Bearer {host_token}'},
    json={'cluster_id': cluster_id, 'num_workers': 2},
)
response.raise_for_status()&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Tue, 30 Aug 2022 17:07:55 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2022-08-30T17:07:55Z</dc:date>
    <item>
      <title>Azure Data Factory: allocate resources per Notebook</title>
      <link>https://community.databricks.com/t5/machine-learning/azure-data-factory-allocate-resources-per-notebook/m-p/33095#M1750</link>
      <description>&lt;P&gt;I'm using Azure Data Factory to create a pipeline of Databricks notebooks, &lt;/P&gt;&lt;P&gt;something like this:&lt;/P&gt;&lt;P&gt;[Notebook 1 - data pre-processing] -&amp;gt; [Notebook 2 - model training] -&amp;gt; [Notebook 3 - performance evaluation].&lt;/P&gt;&lt;P&gt;Can I write a config file that would allow me to allocate resources per notebook (brick)?&lt;/P&gt;&lt;P&gt;Suppose data pre-processing requires 40 workers, while performance evaluation can be done with only 1 worker.&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;&lt;P&gt;Giorgi&lt;/P&gt;</description>
      <pubDate>Tue, 30 Aug 2022 12:57:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/azure-data-factory-allocate-resources-per-notebook/m-p/33095#M1750</guid>
      <dc:creator>Giorgi</dc:creator>
      <dc:date>2022-08-30T12:57:32Z</dc:date>
    </item>
    <item>
      <title>Re: Azure Data Factory: allocate resources per Notebook</title>
      <link>https://community.databricks.com/t5/machine-learning/azure-data-factory-allocate-resources-per-notebook/m-p/33096#M1751</link>
      <description>&lt;P&gt;I understand that, in your case, auto-scaling would take too much time.&lt;/P&gt;&lt;P&gt;The simplest option is to use a separate cluster for each notebook (and make sure the previous cluster is terminated immediately).&lt;/P&gt;&lt;P&gt;Another option is to call the REST API 2.0/clusters/resize endpoint to resize the cluster: &lt;A href="https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/clusters#--resize" target="_blank"&gt;https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/clusters#--resize&lt;/A&gt;&lt;/P&gt;&lt;P&gt;You can even do it directly from a notebook; here is a script that detects all the required parameters:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import requests

# Read the workspace host name, cluster id, and API token from the notebook context
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
domain_name = ctx.tags().get("browserHostName").get()
cluster_id = ctx.clusterId().get()
host_token = ctx.apiToken().get()

# Resize the current cluster to 2 workers
response = requests.post(
    f'https://{domain_name}/api/2.0/clusters/resize',
    headers={'Authorization': f'Bearer {host_token}'},
    json={'cluster_id': cluster_id, 'num_workers': 2},
)
response.raise_for_status()&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 30 Aug 2022 17:07:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/azure-data-factory-allocate-resources-per-notebook/m-p/33096#M1751</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-08-30T17:07:55Z</dc:date>
    </item>
    <item>
      <title>Re: Azure Data Factory: allocate resources per Notebook</title>
      <link>https://community.databricks.com/t5/machine-learning/azure-data-factory-allocate-resources-per-notebook/m-p/33097#M1752</link>
      <description>&lt;P&gt;Thanks for your answer!&lt;/P&gt;&lt;P&gt;A different cluster per notebook does what I need, for now.&lt;/P&gt;&lt;P&gt;The REST API 2.0 solution for resizing clusters seems a more flexible way to go. I guess it should also be possible to create clusters on demand, from JSON configs, via a curl command?&lt;/P&gt;&lt;P&gt;Ideally, I'd like to deploy the ADF pipeline from code (JSON or Python), where I can configure the resources to be used at each step (on demand), as well as the package versions to be installed on each cluster.&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;&lt;P&gt;Giorgi&lt;/P&gt;</description>
      <pubDate>Wed, 31 Aug 2022 09:39:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/azure-data-factory-allocate-resources-per-notebook/m-p/33097#M1752</guid>
      <dc:creator>Giorgi</dc:creator>
      <dc:date>2022-08-31T09:39:43Z</dc:date>
    </item>
  </channel>
</rss>