<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: need help with Azure Databricks questions on CTE and SQL syntax within notebooks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/101252#M40597</link>
    <description>&lt;P&gt;1) Rather than a CTE you might be better served by creating another dataframe and querying from that.&amp;nbsp; Dataframes are more native to the Spark platform and you can have more than one in a notebook.&lt;/P&gt;&lt;P&gt;2) The DB-SQL language reference is&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/" target="_blank"&gt;SQL language reference - Azure Databricks - Databricks SQL | Microsoft Learn&lt;/A&gt;.&amp;nbsp; Regarding TOP/LIMIT, almost every SQL dialect has its own way of limiting results but LIMIT is common in open source databases like MySQL and PostgreSQL, which is probably why it was chosen for DB-SQL.&lt;/P&gt;</description>
    <pubDate>Fri, 06 Dec 2024 15:28:36 GMT</pubDate>
    <dc:creator>Rjdudley</dc:creator>
    <dc:date>2024-12-06T15:28:36Z</dc:date>
    <item>
      <title>need help with Azure Databricks questions on CTE and SQL syntax within notebooks</title>
      <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/7954#M3688</link>
      <description>&lt;P&gt;Hi amazing community folks,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Feel free to share your experience or knowledge regarding below questions:-&lt;/P&gt;&lt;P&gt;1.) Can we pass a CTE sql statement into spark jdbc? i tried to do it i couldn't but i can pass normal sql (Select * from ) and it works. i heard that in spark 3.4, it should be available, is it true? anyone faced it? &lt;/P&gt;&lt;P&gt;2.) anyone has a list of sql function/syntaxes comparison handy. for eg:-&lt;/P&gt;&lt;P&gt;(Top 1 *) works on sql server but its doesn't in ADB notebooks (we need to use limit)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks in advance &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Mar 2023 17:03:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/7954#M3688</guid>
      <dc:creator>Jyo777</dc:creator>
      <dc:date>2023-03-10T17:03:59Z</dc:date>
    </item>
    <item>
      <title>Re: need help with Azure Databricks questions on CTE and SQL syntax within notebooks</title>
      <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/7955#M3689</link>
      <description>&lt;P&gt;@Jyoti j​&amp;nbsp;:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1) Yes, it is possible to pass a CTE (Common Table Expression) SQL statement into Spark JDBC. However, the ability to pass CTEs through Spark JDBC depends on the version of Spark you are using.&lt;/P&gt;&lt;P&gt;In Spark 2.x versions, CTEs are not supported in Spark JDBC. However, in Spark 3.x versions, CTEs are supported. Therefore, if you are using Spark 3.4, you should be able to pass a CTE SQL statement through Spark JDBC. To use CTEs with Spark JDBC in Spark 3.4, you need to make sure that you use the "subquery" option in the JDBC options&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here's an example Python code snippet to use Common Table Expressions (CTEs) with Spark JDBC in Spark 3.4:'&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import SparkSession
&amp;nbsp;
# create a SparkSession
spark = SparkSession.builder \
    .appName("CTE with JDBC in Spark 3.4") \
    .getOrCreate()
&amp;nbsp;
# configure Spark to use the JDBC driver
spark.conf.set("spark.sql.catalog.jdbc.driver", "com.mysql.jdbc.Driver")
&amp;nbsp;
# create a temporary view using a CTE
spark.sql("""
    WITH my_cte AS (
        SELECT col1, col2
        FROM my_table
        WHERE col3 = 'some_value'
    )
    SELECT col1, AVG(col2) as avg_col2
    FROM my_cte
    GROUP BY col1
""").createOrReplaceTempView("my_temp_view")
&amp;nbsp;
# use the temporary view to query a remote database via JDBC
jdbc_df = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:mysql://localhost:3306/my_db") \
    .option("dbtable", "my_temp_view") \
    .option("user", "my_username") \
    .option("password", "my_password") \
    .load()
&amp;nbsp;
# show the results
jdbc_df.show()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;2) Some examples are as below &lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Top function:
SQL Server: SELECT TOP 1 * FROM my_table
Azure Databricks: SELECT * FROM my_table LIMIT 1
&amp;nbsp;
Date functions:
SQL Server: DATEADD(day, 7, my_date) to add 7 days to a date
Azure Databricks: DATE_ADD(my_date, INTERVAL 7 DAYS) to add 7 days to a date
&amp;nbsp;
Substring function:
SQL Server: SUBSTRING(my_string, 1, 3) to get the first 3 characters of a string
Azure Databricks: SUBSTR(my_string, 1, 3) to get the first 3 characters of a string
&amp;nbsp;
String concatenation:
SQL Server: SELECT 'Hello ' + 'world' to concatenate two strings
Azure Databricks: SELECT CONCAT('Hello ', 'world') to concatenate two strings
&amp;nbsp;
Date formatting:
SQL Server: SELECT FORMAT(my_date, 'dd/MM/yyyy') to format a date as dd/MM/yyyy
Azure Databricks: SELECT DATE_FORMAT(my_date, 'dd/MM/yyyy') to format a date as dd/MM/yyyy&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 13 Mar 2023 07:54:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/7955#M3689</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-03-13T07:54:59Z</dc:date>
    </item>
    <item>
      <title>Re: need help with Azure Databricks questions on CTE and SQL syntax within notebooks</title>
      <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/7957#M3691</link>
      <description>&lt;P&gt;hey thanks so much for putting all information for me...&lt;/P&gt;&lt;P&gt;1.) question :- r u sure about spark 2* version? we currently have 3.* but still cant do it. just to clarify our table coming from jdbc connection directly.&lt;/P&gt;&lt;P&gt;&amp;nbsp;for eg:- my_table is coming from a  jdbc connection to oracle or sql server&lt;/P&gt;&lt;P&gt;2.) regarding sql syntax : thanks &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Mar 2023 20:52:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/7957#M3691</guid>
      <dc:creator>Jyo777</dc:creator>
      <dc:date>2023-03-21T20:52:53Z</dc:date>
    </item>
    <item>
      <title>Re: need help with Azure Databricks questions on CTE and SQL syntax within notebooks</title>
      <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/7958#M3692</link>
      <description>&lt;P&gt;@Jyoti j​&amp;nbsp;:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;can you create a temporary view using your CTE statement and then query that view using Spark JDBC. Here's an example of how you can create a temporary view using a CTE statement:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.sql("WITH cte AS (SELECT * FROM table_name WHERE column_name = 'some_value') \
          SELECT * FROM cte").createOrReplaceTempView("temp_view_name")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 02 Apr 2023 04:48:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/7958#M3692</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-04-02T04:48:18Z</dc:date>
    </item>
    <item>
      <title>Re: need help with Azure Databricks questions on CTE and SQL syntax within notebooks</title>
      <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/86208#M37298</link>
      <description>&lt;P&gt;&lt;SPAN&gt;CTE expressions are supported with the `prepareQuery` option.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html" target="_blank"&gt;https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;A prefix that will form the final query together with&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;query&lt;/CODE&gt;&lt;SPAN&gt;. As the specified&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;query&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;will be parenthesized as a subquery in the&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;FROM&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;clause and some databases do not support all clauses in subqueries, the&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;prepareQuery&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;property offers a way to run such complex queries. As an example, spark will issue a query of the following form to the JDBC Source.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;CODE&gt;&amp;lt;prepareQuery&amp;gt; SELECT &amp;lt;columns&amp;gt; FROM (&amp;lt;user_specified_query&amp;gt;) spark_gen_alias&lt;/CODE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Below are a couple of examples.&lt;/SPAN&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;MSSQL Server does not accept&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;WITH&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;clauses in subqueries but it is possible to split such a query to&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;prepareQuery&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;and&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;query&lt;/CODE&gt;:&lt;BR /&gt;&lt;CODE&gt;spark.read.format("jdbc")&lt;BR /&gt;.option("url", jdbcUrl)&lt;BR /&gt;.option("prepareQuery", "WITH t AS (SELECT x, y FROM tbl)")&lt;BR /&gt;.option("query", "SELECT * FROM t WHERE x &amp;gt; 10")&lt;BR /&gt;.load()&lt;/CODE&gt;&lt;/LI&gt;
&lt;LI&gt;MSSQL Server does not accept temp table clauses in subqueries but it is possible to split such a query to&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;prepareQuery&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;and&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;query&lt;/CODE&gt;:&lt;BR /&gt;&lt;CODE&gt;spark.read.format("jdbc")&lt;BR /&gt;.option("url", jdbcUrl)&lt;BR /&gt;.option("prepareQuery", "(SELECT * INTO #TempTable FROM (SELECT * FROM tbl) t)")&lt;BR /&gt;.option("query", "SELECT * FROM #TempTable")&lt;BR /&gt;.load()&lt;/CODE&gt;&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Thu, 29 Aug 2024 07:00:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/86208#M37298</guid>
      <dc:creator>vijaypavann-db</dc:creator>
      <dc:date>2024-08-29T07:00:54Z</dc:date>
    </item>
    <item>
      <title>Re: need help with Azure Databricks questions on CTE and SQL syntax within notebooks</title>
      <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/101189#M40578</link>
      <description>&lt;P&gt;Nope prepareQuery doesn't wok either&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 06 Dec 2024 09:25:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/101189#M40578</guid>
      <dc:creator>csanjay100</dc:creator>
      <dc:date>2024-12-06T09:25:15Z</dc:date>
    </item>
    <item>
      <title>Re: need help with Azure Databricks questions on CTE and SQL syntax within notebooks</title>
      <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/101252#M40597</link>
      <description>&lt;P&gt;1) Rather than a CTE you might be better served by creating another dataframe and querying from that.&amp;nbsp; Dataframes are more native to the Spark platform and you can have more than one in a notebook.&lt;/P&gt;&lt;P&gt;2) The DB-SQL language reference is&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/" target="_blank"&gt;SQL language reference - Azure Databricks - Databricks SQL | Microsoft Learn&lt;/A&gt;.&amp;nbsp; Regarding TOP/LIMIT, almost every SQL dialect has its own way of limiting results but LIMIT is common in open source databases like MySQL and PostgreSQL, which is probably why it was chosen for DB-SQL.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Dec 2024 15:28:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/101252#M40597</guid>
      <dc:creator>Rjdudley</dc:creator>
      <dc:date>2024-12-06T15:28:36Z</dc:date>
    </item>
    <item>
      <title>Re: need help with Azure Databricks questions on CTE and SQL syntax within notebooks</title>
      <link>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/101298#M40620</link>
      <description>&lt;P&gt;Not a comparison, but there is a DB-SQL cheatsheet at&amp;nbsp;&lt;A href="https://www.databricks.com/sites/default/files/2023-09/databricks-sql-cheatsheet.pdf/" target="_blank"&gt;https://www.databricks.com/sites/default/files/2023-09/databricks-sql-cheatsheet.pdf/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 07 Dec 2024 02:31:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-help-with-azure-databricks-questions-on-cte-and-sql-syntax/m-p/101298#M40620</guid>
      <dc:creator>Rjdudley</dc:creator>
      <dc:date>2024-12-07T02:31:59Z</dc:date>
    </item>
  </channel>
</rss>

