Create table from json and flatten in the same SQL

David_Billa — Mon, 06 Jan 2025 14:39:33 GMT

Any help in writing SQL to create the table using JSON file and flatten in the same step? As I'm new to JSON it would be nice if someone can give the heads up by referencing to any document or help provide the recommended the solution.

Sample JSON file is like,

{"Clnt":"95901","NaDt":"2023-05-04","PrCd":"ET","IFCd":"RDM121884","GLId":"1014765", "Prices":[{"Ord":5,"Vndr":"Test","Prc":45850.00,"Cur":"RW","Typ":"Bidd","Dt":"2023-05-04","Cmnt":"Test"}, {"Ord":6,"Expr":"Test","Prc":45950.00,"Cur":"RW","Typ":"NA","Dt":"2023-06-04","Cmnt":"New"}]} {"Clnt":"95903","NaDt":"20223-05-04","PrCd":"ET","IFCd":"RDM121884","GLId":"1014765", "Prices":[{"Ord":8,"Vndr":"Test","Prc":45750.00,"Cur":"RW","Typ":"Test","Dt":"2022-05-04","Cmnt":"others"}]}

Re: Create table from json and flatten in the same SQL

Walter_C — Mon, 06 Jan 2025 14:42:35 GMT

To create a table directly from a JSON file and flatten it using SQL in Databricks, you can use the CREATE TABLE statement with the USING JSON clause. However, SQL alone does not provide a direct way to flatten nested JSON structures. You would typically need to use a combination of SQL and DataFrame operations to achieve this.

Here is an example of how you can create a table from a JSON file using SQL:

-- Create a table from a JSON file
CREATE TABLE your_table_name
USING JSON
OPTIONS (path 'path/to/your/json/file');

In this example:

Replace 'path/to/your/json/file' with the actual path to your JSON file.

To flatten the nested JSON structure, you would need to use DataFrame operations in combination with SQL. Here is an example of how you can achieve this using PySpark:


# Read the JSON file into a DataFrame
df = spark.read.json("path/to/your/json/file")

# Flatten the nested JSON structure
from pyspark.sql.functions import explode
df_flattened = df.withColumn("Price", explode(df.Prices)).select("Clnt", "NaDt", "PrCd", "IFCd", "GLId", "Price.*")

# Create a temporary view from the DataFrame
df_flattened.createOrReplaceTempView("temp_view")

# Use SQL to create a table from the view
spark.sql("""
    CREATE TABLE your_table_name AS
    SELECT * FROM temp_view
""")

topic Re: Create table from json and flatten in the same SQL in Data Engineering

Create table from json and flatten in the same SQL

Re: Create table from json and flatten in the same SQL