<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Best option for configuring Data Storage for Serverless SQL Warehouse in Warehousing &amp; Analytics</title>
    <link>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120739#M2091</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/166865"&gt;@Curious-mind&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You got it. Running COPY INTO is a good fit for the initial load, as it's optimized for bulk loads. Going forward, you'll want to use Auto Loader to incrementally process newly arriving files.&lt;/P&gt;</description>
    <pubDate>Mon, 02 Jun 2025 17:14:43 GMT</pubDate>
    <dc:creator>Shua42</dc:creator>
    <dc:date>2025-06-02T17:14:43Z</dc:date>
    <item>
      <title>Best option for configuring Data Storage for Serverless SQL Warehouse</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120720#M2088</link>
      <description>&lt;P&gt;Hello!&lt;/P&gt;&lt;P&gt;I'm new to Databricks.&lt;/P&gt;&lt;P&gt;Assume I need to migrate a 2 TB Oracle data mart to Databricks on Azure. Serverless SQL Warehouse seems like a valid choice.&lt;/P&gt;&lt;P&gt;What is the better option (cost vs. performance) for storing the data?&lt;/P&gt;&lt;P&gt;Should I upload Oracle extracts to Azure Blob Storage and create external tables?&lt;/P&gt;&lt;P&gt;Or is it better to use COPY INTO ... FROM to create managed tables?&lt;/P&gt;&lt;P&gt;Data size will grow by ~1 TB per year.&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jun 2025 14:04:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120720#M2088</guid>
      <dc:creator>Curious-mind</dc:creator>
      <dc:date>2025-06-02T14:04:15Z</dc:date>
    </item>
    <item>
      <title>Re: Best option for configuring Data Storage for Serverless SQL Warehouse</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120727#M2089</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/166865"&gt;@Curious-mind&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;Welcome to Databricks! For your use case, I think creating managed tables using COPY INTO is going to be more performant, which will lead to better cost efficiency as well.&amp;nbsp;While external tables could initially be a bit cheaper, managed Delta tables offer significant performance and usability benefits that pay off as your data grows.&lt;/P&gt;
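&lt;P&gt;For illustration, a minimal initial-load sketch might look like the following (the catalog, schema, table, and storage path here are placeholders, so substitute your own):&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;-- Bulk-load CSV extracts into a managed Delta table&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;COPY INTO my_catalog.my_schema.sales&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;FROM 'abfss://landing@mystorageacct.dfs.core.windows.net/oracle_extracts/sales'&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;FILEFORMAT = CSV&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;COPY_OPTIONS ('mergeSchema' = 'true');&lt;/EM&gt;&lt;/P&gt;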
&lt;P&gt;Here are a few benefits that managed tables offer over external tables:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P&gt;Faster queries with indexing, caching, and Delta optimizations&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;Easier schema enforcement, versioning, and time travel&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;Seamless use with Unity Catalog, RBAC, and Serverless SQL&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;Better support for optimization (&lt;CODE&gt;OPTIMIZE&lt;/CODE&gt;, &lt;CODE&gt;VACUUM&lt;/CODE&gt;, etc.)&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Mon, 02 Jun 2025 15:38:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120727#M2089</guid>
      <dc:creator>Shua42</dc:creator>
      <dc:date>2025-06-02T15:38:27Z</dc:date>
    </item>
    <item>
      <title>Re: Best option for configuring Data Storage for Serverless SQL Warehouse</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120738#M2090</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/154481"&gt;@Shua42&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Thank you for the prompt reply.&lt;/P&gt;&lt;P&gt;For the managed tables' initial load:&lt;/P&gt;&lt;P&gt;Can I simply run a COPY command:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;COPY INTO&amp;nbsp;DELTA_TABLE&amp;nbsp;FROM 'abfss://container@storageAccount.dfs.core.windows.net/base/path'&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;FILEFORMAT = CSV&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;FILES = ('f1.csv', 'f2.csv', ...)&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;or is it better to use Auto Loader?&lt;/P&gt;&lt;P&gt;Some source Oracle tables can have 100M+ rows.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jun 2025 17:15:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120738#M2090</guid>
      <dc:creator>Curious-mind</dc:creator>
      <dc:date>2025-06-02T17:15:36Z</dc:date>
    </item>
    <item>
      <title>Re: Best option for configuring Data Storage for Serverless SQL Warehouse</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120739#M2091</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/166865"&gt;@Curious-mind&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You got it. Running COPY INTO is a good fit for the initial load, as it's optimized for bulk loads. Going forward, you'll want to use Auto Loader to incrementally process newly arriving files.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jun 2025 17:14:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120739#M2091</guid>
      <dc:creator>Shua42</dc:creator>
      <dc:date>2025-06-02T17:14:43Z</dc:date>
    </item>
    <item>
      <title>Re: Best option for configuring Data Storage for Serverless SQL Warehouse</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120829#M2098</link>
      <description>&lt;P&gt;If I get it right, we can use default storage for managed data, or&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/connect/unity-catalog/cloud-storage/managed-storage#set-a-managed-storage-location-for-a-catalog" target="_blank" rel="noopener nofollow ugc"&gt;&amp;nbsp;set a managed storage location for a catalog&lt;/A&gt;:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;CREATE CATALOG &amp;lt;catalog-name&amp;gt; MANAGED LOCATION 'abfss://&amp;lt;container-name&amp;gt;@&amp;lt;storage-account&amp;gt;.dfs.core.windows.net/&amp;lt;path&amp;gt;/&amp;lt;directory&amp;gt;';&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;What are the reasons to create our own managed location vs. using the default one?&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jun 2025 14:10:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120829#M2098</guid>
      <dc:creator>Curious-mind</dc:creator>
      <dc:date>2025-06-03T14:10:31Z</dc:date>
    </item>
    <item>
      <title>Re: Best option for configuring Data Storage for Serverless SQL Warehouse</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120847#M2099</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/166865"&gt;@Curious-mind&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;The main benefit of creating your own managed location is better isolation and manageability. It will depend on how large your data is, but if you want data stored in a specific location per catalog, rather than having every new catalog land in the root of the metastore, then specifying a managed location is what you'd want to do.&lt;/P&gt;
      <pubDate>Tue, 03 Jun 2025 18:08:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/best-option-for-configuring-data-storage-for-serverless-sql/m-p/120847#M2099</guid>
      <dc:creator>Shua42</dc:creator>
      <dc:date>2025-06-03T18:08:35Z</dc:date>
    </item>
  </channel>
</rss>

