<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Parametrizing query for DEEP CLONE in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/parametrizing-query-for-deep-clone/m-p/75954#M35115</link>
    <description>&lt;P&gt;Update: Hey moderator, I've removed the link to the Bobby tables XKCD to reassure that this post is not spam &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Hi, I'm somehow unable to write a parametrized query to create a DEEP CLONE. I'm trying really hard to avoid using string interpolation (to prevent SQL injection), to no avail.&lt;/P&gt;&lt;P&gt;I'm able to run this unparametrized query:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;CREATE OR REPLACE TABLE catalog_1.schema_2.table_1
DEEP CLONE catalog_1.schema_1.table_1&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now, what I'd really like to do would be to parametrize the entities involved, so I can wrap this with a function that clones based on the parameters I provide.&lt;/P&gt;&lt;P&gt;Now, according to the documentation, since the query uses DDL, I should use an &lt;A href="https://docs.databricks.com/en/sql/language-manual/sql-ref-names-identifier-clause.html" target="_self"&gt;IDENTIFIER clause with named parameter markers&lt;/A&gt; to enable a dynamic reference to the tables. I've extrapolated the third example in the linked documentation, but it just doesn't work&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":face_screaming_in_fear:"&gt;😱&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;def clone_table(
    source_catalog,
    source_schema,
    source_table,
    target_catalog,
    target_schema,
    target_table,
):
    _clone_table_query = """
        CREATE OR REPLACE TABLE IDENTIFIER(:target_catalog || '.' || :target_schema || '.' || :target_table)
        DEEP CLONE IDENTIFIER(:source_catalog || '.' || :source_schema || '.' || :source_table)
    """
    spark.sql(
        _clone_table_query,
        {
            "source_catalog": source_catalog,
            "source_schema": source_schema,
            "source_table": source_table,
            "target_catalog": target_catalog,
            "target_schema": target_schema,
            "target_table": target_table,
        },
    )&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Then I try to execute the function with the values I had in the unparametrized query, but I get an error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;clone_table("catalog_1", "schema_1", "table_1", "catalog_1", "schema_2", "table_1")&lt;/LI-CODE&gt;&lt;LI-CODE lang="python"&gt;[DELTA_CLONE_UNSUPPORTED_SOURCE] Unsupported DEEP clone source ''PlanWithUnresolvedIdentifier concat(concat(concat(concat(catalog_1, .), schema_1), .), table_1), org.apache.spark.sql.catalyst.parser.AstBuilder$$Lambda$5725/1148071365@723da234
', whose format is Unknown.
The supported formats are 'delta', 'iceberg' and 'parquet'. SQLSTATE: 0AKDC&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is it just me, or is there really no way around string interpolation and Bobby tables?&lt;/P&gt;</description>
    <pubDate>Thu, 27 Jun 2024 14:43:59 GMT</pubDate>
    <dc:creator>camilo_s</dc:creator>
    <dc:date>2024-06-27T14:43:59Z</dc:date>
    <item>
      <title>Parametrizing query for DEEP CLONE</title>
      <link>https://community.databricks.com/t5/data-engineering/parametrizing-query-for-deep-clone/m-p/75954#M35115</link>
      <description>&lt;P&gt;Update: Hey moderator, I've removed the link to the Bobby tables XKCD to reassure that this post is not spam &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Hi, I'm somehow unable to write a parametrized query to create a DEEP CLONE. I'm trying really hard to avoid using string interpolation (to prevent SQL injection), to no avail.&lt;/P&gt;&lt;P&gt;I'm able to run this unparametrized query:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;CREATE OR REPLACE TABLE catalog_1.schema_2.table_1
DEEP CLONE catalog_1.schema_1.table_1&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now, what I'd really like to do would be to parametrize the entities involved, so I can wrap this with a function that clones based on the parameters I provide.&lt;/P&gt;&lt;P&gt;Now, according to the documentation, since the query uses DDL, I should use an &lt;A href="https://docs.databricks.com/en/sql/language-manual/sql-ref-names-identifier-clause.html" target="_self"&gt;IDENTIFIER clause with named parameter markers&lt;/A&gt; to enable a dynamic reference to the tables. I've extrapolated the third example in the linked documentation, but it just doesn't work&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":face_screaming_in_fear:"&gt;😱&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;def clone_table(
    source_catalog,
    source_schema,
    source_table,
    target_catalog,
    target_schema,
    target_table,
):
    _clone_table_query = """
        CREATE OR REPLACE TABLE IDENTIFIER(:target_catalog || '.' || :target_schema || '.' || :target_table)
        DEEP CLONE IDENTIFIER(:source_catalog || '.' || :source_schema || '.' || :source_table)
    """
    spark.sql(
        _clone_table_query,
        {
            "source_catalog": source_catalog,
            "source_schema": source_schema,
            "source_table": source_table,
            "target_catalog": target_catalog,
            "target_schema": target_schema,
            "target_table": target_table,
        },
    )&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Then I try to execute the function with the values I had in the unparametrized query, but I get an error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;clone_table("catalog_1", "schema_1", "table_1", "catalog_1", "schema_2", "table_1")&lt;/LI-CODE&gt;&lt;LI-CODE lang="python"&gt;[DELTA_CLONE_UNSUPPORTED_SOURCE] Unsupported DEEP clone source ''PlanWithUnresolvedIdentifier concat(concat(concat(concat(catalog_1, .), schema_1), .), table_1), org.apache.spark.sql.catalyst.parser.AstBuilder$$Lambda$5725/1148071365@723da234
', whose format is Unknown.
The supported formats are 'delta', 'iceberg' and 'parquet'. SQLSTATE: 0AKDC&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is it just me, or is there really no way around string interpolation and Bobby tables?&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2024 14:43:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/parametrizing-query-for-deep-clone/m-p/75954#M35115</guid>
      <dc:creator>camilo_s</dc:creator>
      <dc:date>2024-06-27T14:43:59Z</dc:date>
    </item>
  </channel>
</rss>

