<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: delta lake in Apache Spark in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/delta-lake-in-apache-spark/m-p/8805#M4336</link>
    <description>&lt;P&gt;@Arun Sethia:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Yes, Delta Lake also supports custom catalogs. Delta Lake plugs into Spark through the catalog plugin API introduced in Spark 3, which allows pluggable catalog implementations, so you can implement your own catalog to use with Delta Lake.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To use a custom catalog, set the configuration property spark.sql.catalog.my_custom_catalog to the fully qualified class name of your catalog implementation. The implementation must be a JVM (Scala or Java) class on the Spark classpath that implements org.apache.spark.sql.connector.catalog.CatalogPlugin, typically through TableCatalog; it cannot be written as a Python subclass of pyspark.sql.catalog.Catalog, which is only the wrapper behind spark.catalog. You can then refer to tables by specifying the catalog and database in the table identifier, like so: my_custom_catalog.my_database.my_table.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here's an example of how to register and use a custom catalog from PySpark:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Register the custom catalog. com.example.MyCustomCatalog is a placeholder for
# your own JVM class implementing org.apache.spark.sql.connector.catalog.CatalogPlugin.
# Set this before the catalog is first referenced.
spark.conf.set("spark.sql.catalog.my_custom_catalog", "com.example.MyCustomCatalog")

# Read a table through the custom catalog
df = spark.read.table("my_custom_catalog.my_database.my_table")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;In the above example, com.example.MyCustomCatalog stands for your JVM implementation of Spark's CatalogPlugin interface, and spark.sql.catalog.my_custom_catalog is set to its fully qualified class name. You can then query tables with the custom catalog specified in the table identifier. Whether those tables are handled as Delta tables depends on what your catalog implementation returns.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Hope this helps you figure out your solution!&lt;/P&gt;</description>
    <pubDate>Fri, 31 Mar 2023 15:43:02 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2023-03-31T15:43:02Z</dc:date>
    <item>
      <title>delta lake in Apache Spark</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-lake-in-apache-spark/m-p/8804#M4335</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;According to the documentation (&lt;A href="https://docs.delta.io/latest/quick-start.html" target="test_blank"&gt;https://docs.delta.io/latest/quick-start.html&lt;/A&gt;), we can configure &lt;B&gt;DeltaCatalog&lt;/B&gt; using spark.sql.catalog.spark_catalog.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Iceberg supports two catalog implementations (&lt;A href="https://iceberg.apache.org/docs/latest/spark-configuration/#catalogs" target="test_blank"&gt;https://iceberg.apache.org/docs/latest/spark-configuration/#catalogs&lt;/A&gt;):&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Replacing the session catalog (spark_catalog) using org.apache.iceberg.spark.SparkSessionCatalog, which adds support for Iceberg tables to Spark’s built-in catalog and delegates to the built-in catalog for non-Iceberg tables&lt;/LI&gt;&lt;LI&gt;Custom catalog using org.apache.iceberg.spark.SparkCatalog, which supports a Hive Metastore or a Hadoop warehouse as a catalog&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Does Delta Lake have an option similar to Iceberg's, where we can configure a custom catalog?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Feb 2023 22:43:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-lake-in-apache-spark/m-p/8804#M4335</guid>
      <dc:creator>asethia</dc:creator>
      <dc:date>2023-02-23T22:43:24Z</dc:date>
    </item>
    <item>
      <title>Re: delta lake in Apache Spark</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-lake-in-apache-spark/m-p/8805#M4336</link>
      <description>&lt;P&gt;@Arun Sethia:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Yes, Delta Lake also supports custom catalogs. Delta Lake plugs into Spark through the catalog plugin API introduced in Spark 3, which allows pluggable catalog implementations, so you can implement your own catalog to use with Delta Lake.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To use a custom catalog, set the configuration property spark.sql.catalog.my_custom_catalog to the fully qualified class name of your catalog implementation. The implementation must be a JVM (Scala or Java) class on the Spark classpath that implements org.apache.spark.sql.connector.catalog.CatalogPlugin, typically through TableCatalog; it cannot be written as a Python subclass of pyspark.sql.catalog.Catalog, which is only the wrapper behind spark.catalog. You can then refer to tables by specifying the catalog and database in the table identifier, like so: my_custom_catalog.my_database.my_table.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here's an example of how to register and use a custom catalog from PySpark:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Register the custom catalog. com.example.MyCustomCatalog is a placeholder for
# your own JVM class implementing org.apache.spark.sql.connector.catalog.CatalogPlugin.
# Set this before the catalog is first referenced.
spark.conf.set("spark.sql.catalog.my_custom_catalog", "com.example.MyCustomCatalog")

# Read a table through the custom catalog
df = spark.read.table("my_custom_catalog.my_database.my_table")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;In the above example, com.example.MyCustomCatalog stands for your JVM implementation of Spark's CatalogPlugin interface, and spark.sql.catalog.my_custom_catalog is set to its fully qualified class name. You can then query tables with the custom catalog specified in the table identifier. Whether those tables are handled as Delta tables depends on what your catalog implementation returns.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Hope this helps you figure out your solution!&lt;/P&gt;</description>
      <pubDate>Fri, 31 Mar 2023 15:43:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-lake-in-apache-spark/m-p/8805#M4336</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-03-31T15:43:02Z</dc:date>
    </item>
  </channel>
</rss>

