<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Containerized Databricks/Spark database in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23279#M16037</link>
    <description>&lt;P&gt;The Simba driver is for Spark connections; I doubt it will work with a database.&lt;/P&gt;&lt;P&gt;Why would you use this driver to connect to a database in a container? Or do you mean running Databricks in a local container?&lt;/P&gt;&lt;P&gt;If the latter: that is not available.&lt;/P&gt;</description>
    <pubDate>Fri, 08 Apr 2022 12:27:50 GMT</pubDate>
    <dc:creator>-werners-</dc:creator>
    <dc:date>2022-04-08T12:27:50Z</dc:date>
    <item>
      <title>Containerized Databricks/Spark database</title>
      <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23278#M16036</link>
      <description>&lt;P&gt;Hello. I'm fairly new to Databricks and Spark.&lt;/P&gt;&lt;P&gt;I have a requirement to connect to Databricks using JDBC, and that works perfectly using the driver I downloaded from the Databricks website (&lt;B&gt;"com.simba.spark.jdbc.Driver"&lt;/B&gt;).&lt;/P&gt;&lt;P&gt;What I would like to do now is have a locally running instance of a database in Docker that I can connect to &lt;B&gt;using the same driver&lt;/B&gt;. I'd like to automatically initialise the database by creating tables when it starts up, very much like how you would use docker-entrypoint-initdb.d to create tables on startup for PostgreSQL.&lt;/P&gt;&lt;P&gt;I'd then like to insert some data and run some tests locally.&lt;/P&gt;&lt;P&gt;Is any of this possible?&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2022 08:33:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23278#M16036</guid>
      <dc:creator>knight007</dc:creator>
      <dc:date>2022-04-08T08:33:32Z</dc:date>
    </item>
    <item>
      <title>Re: Containerized Databricks/Spark database</title>
      <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23279#M16037</link>
      <description>&lt;P&gt;The Simba driver is for Spark connections; I doubt it will work with a database.&lt;/P&gt;&lt;P&gt;Why would you use this driver to connect to a database in a container? Or do you mean running Databricks in a local container?&lt;/P&gt;&lt;P&gt;If the latter: that is not available.&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2022 12:27:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23279#M16037</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-04-08T12:27:50Z</dc:date>
    </item>
    <item>
      <title>Re: Containerized Databricks/Spark database</title>
      <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23280#M16038</link>
      <description>&lt;P&gt;OK, so maybe I didn't ask the right question.&lt;/P&gt;&lt;P&gt;At the moment we use the Simba driver to connect to Databricks, and we can perform SQL queries.&lt;/P&gt;&lt;P&gt;Can I achieve the same thing locally using a dockerized Databricks or Spark runtime?&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2022 13:01:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23280#M16038</guid>
      <dc:creator>knight007</dc:creator>
      <dc:date>2022-04-08T13:01:43Z</dc:date>
    </item>
    <item>
      <title>Re: Containerized Databricks/Spark database</title>
      <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23281#M16039</link>
      <description>&lt;P&gt;Databricks: no.&lt;/P&gt;&lt;P&gt;Spark: I guess so, but it will take some effort to gather all the necessary dependencies and create a container (or look for one on Docker Hub).&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2022 13:06:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23281#M16039</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-04-08T13:06:44Z</dc:date>
    </item>
    <item>
      <title>Re: Containerized Databricks/Spark database</title>
      <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23282#M16040</link>
      <description>&lt;P&gt;If you are just running queries on tables, you could also look into something like Dremio, which can be run in Docker as a single node.&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2022 13:07:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23282#M16040</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-04-08T13:07:51Z</dc:date>
    </item>
    <item>
      <title>Re: Containerized Databricks/Spark database</title>
      <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23283#M16041</link>
      <description>&lt;P&gt;Can I connect to that using the same Simba driver?&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2022 13:11:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23283#M16041</guid>
      <dc:creator>knight007</dc:creator>
      <dc:date>2022-04-08T13:11:14Z</dc:date>
    </item>
    <item>
      <title>Re: Containerized Databricks/Spark database</title>
      <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23284#M16042</link>
      <description>&lt;P&gt;I don't know if the Databricks driver is the same as the classic Simba driver.&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2022 14:21:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23284#M16042</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-04-08T14:21:20Z</dc:date>
    </item>
    <item>
      <title>Re: Containerized Databricks/Spark database</title>
      <link>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23285#M16043</link>
      <description>&lt;P&gt;@Gurps Bassi, a "running instance of a database in docker" would be a Hive metastore, which is just a mapping to data that usually sits physically in the data lake. Databricks is so cloud-centric that setting up a metastore locally doesn't make sense. Instead, set up two Databricks workspaces: one with a stage Repo (where you do your development) and another with the master Repo.&lt;/P&gt;&lt;P&gt;For those who want to develop locally in an IDE, the Databricks tunnel for Visual Studio Code will be available soon.&lt;/P&gt;</description>
      <pubDate>Mon, 11 Apr 2022 20:29:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/containerized-databricks-spark-database/m-p/23285#M16043</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-04-11T20:29:20Z</dc:date>
    </item>
  </channel>
</rss>

