MathieuDB
Databricks Employee

Hello @thecodecache,

Have a look at the SQLGlot project: https://github.com/tobymao/sqlglot?tab=readme-ov-file#faq

It can transpile SQL written in another dialect to Spark SQL, like this:

import sqlglot
from pyspark.sql import SparkSession

# Initialize Spark session
spark = SparkSession.builder.appName("SQLGlot Example").getOrCreate()

# Original SQL query
sql_query = """
SELECT bar.a, b + 1 AS b
FROM bar
JOIN baz ON bar.a = baz.a
WHERE bar.a > 1
"""

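# Convert SQL to the Spark SQL dialect.
# No `read` dialect is passed, so SQLGlot parses with its default dialect;
# set read="mysql", read="postgres", etc. if you know the source dialect.
# transpile() returns one string per statement, hence the [0].
spark_sql = sqlglot.transpile(sql_query, write="spark")[0]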

# Create a DataFrame from the Spark SQL query
df = spark.sql(spark_sql)

# Show the resulting DataFrame
df.show()