<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Couple of Delta Lake questions in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12477#M7277</link>
    <description>&lt;P&gt;Hi @Jay Allen,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Just a friendly follow-up. Did any of the responses help resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.&lt;/P&gt;</description>
    <pubDate>Fri, 29 Jul 2022 23:36:23 GMT</pubDate>
    <dc:creator>jose_gonzalez</dc:creator>
    <dc:date>2022-07-29T23:36:23Z</dc:date>
    <item>
      <title>Couple of Delta Lake questions</title>
      <link>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12473#M7273</link>
      <description>&lt;P&gt;Hey guys,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We're considering Delta Lake as the storage for our project and have a couple of questions. The first is about pricing: what does Delta Lake cost? We can't seem to find a page that says x amount costs y.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The second question is more technical: if we want to use the Python library to access our Delta Lake data instead of Spark, do we have to convert the Delta Lake table to a pandas DataFrame? This blog post seems to say so: &lt;A href="https://databricks.com/blog/2020/12/22/natively-query-your-delta-lake-with-scala-java-and-python.html" target="test_blank"&gt;https://databricks.com/blog/2020/12/22/natively-query-your-delta-lake-with-scala-java-and-python.html&lt;/A&gt;. Our concern is that our Delta Lake will hold many GBs of data, which won't fit in a single pandas DataFrame.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Jay&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jul 2022 15:41:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12473#M7273</guid>
      <dc:creator>jayallenmn</dc:creator>
      <dc:date>2022-07-26T15:41:27Z</dc:date>
    </item>
    <item>
      <title>Re: Couple of Delta Lake questions</title>
      <link>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12474#M7274</link>
      <description>&lt;P&gt;Delta Lake itself is free. It is an open-source storage format. But you will have to pay for storage and compute, of course.&lt;/P&gt;&lt;P&gt;If you want to use Databricks with Delta Lake, it will not be free unless you use the Community Edition.&lt;/P&gt;&lt;P&gt;Depending on what you plan to do, the cost can range from very low to very high.&lt;/P&gt;&lt;P&gt;You can also use Delta Lake without Databricks, by the way.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;About your second question: pandas is indeed an option. And your concern is exactly why distributed data processing frameworks like Spark were created.&lt;/P&gt;&lt;P&gt;If you want to avoid Spark, you might want to look into Dask or Ray.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2022 11:03:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12474#M7274</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-07-27T11:03:02Z</dc:date>
    </item>
    <item>
      <title>Re: Couple of Delta Lake questions</title>
      <link>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12475#M7275</link>
      <description>&lt;P&gt;Thanks @Werner Stinckens - would you recommend processing Delta Lake data with Databricks/Spark?&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2022 17:27:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12475#M7275</guid>
      <dc:creator>jayallenmn</dc:creator>
      <dc:date>2022-07-27T17:27:08Z</dc:date>
    </item>
    <item>
      <title>Re: Couple of Delta Lake questions</title>
      <link>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12476#M7276</link>
      <description>&lt;P&gt;Totally!&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2022 11:01:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12476#M7276</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-07-28T11:01:26Z</dc:date>
    </item>
    <item>
      <title>Re: Couple of Delta Lake questions</title>
      <link>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12477#M7277</link>
      <description>&lt;P&gt;Hi @Jay Allen,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Just a friendly follow-up. Did any of the responses help resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jul 2022 23:36:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/couple-of-delta-lake-questions/m-p/12477#M7277</guid>
      <dc:creator>jose_gonzalez</dc:creator>
      <dc:date>2022-07-29T23:36:23Z</dc:date>
    </item>
  </channel>
</rss>

