Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Create table from json and flatten in the same SQL

David_Billa
New Contributor III

Can anyone help with writing SQL to create a table from a JSON file and flatten it in the same step? As I'm new to JSON, it would be great if someone could point me to relevant documentation or suggest a recommended solution.

The sample JSON file looks like this:

{"Clnt":"95901","NaDt":"2023-05-04","PrCd":"ET","IFCd":"RDM121884","GLId":"1014765",
"Prices":[{"Ord":5,"Vndr":"Test","Prc":45850.00,"Cur":"RW","Typ":"Bidd","Dt":"2023-05-04","Cmnt":"Test"},
          {"Ord":6,"Expr":"Test","Prc":45950.00,"Cur":"RW","Typ":"NA","Dt":"2023-06-04","Cmnt":"New"}]}
{"Clnt":"95903","NaDt":"20223-05-04","PrCd":"ET","IFCd":"RDM121884","GLId":"1014765",
"Prices":[{"Ord":8,"Vndr":"Test","Prc":45750.00,"Cur":"RW","Typ":"Test","Dt":"2022-05-04","Cmnt":"others"}]}

1 REPLY

Walter_C
Databricks Employee

To create a table directly from a JSON file using SQL in Databricks, you can use a CREATE TABLE statement with the USING JSON clause. Flattening the nested Prices array, however, requires an additional step with a generator function such as explode, either in SQL or through DataFrame operations.

Here is an example of how you can create a table from a JSON file using SQL:

-- Create a table from a JSON file
CREATE TABLE your_table_name
USING JSON
OPTIONS (path 'path/to/your/json/file');

In this example, replace 'path/to/your/json/file' with the actual path to your JSON file.
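
As a quick sanity check, you can then inspect the inferred schema of the new table; the nested Prices column should appear as an array of structs. For example:

-- Inspect the inferred schema; Prices should show up as array<struct<...>>
DESCRIBE TABLE your_table_name;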

One way to flatten the nested structure is to combine DataFrame operations with SQL. Here is an example using PySpark:


# Import the generator function used to flatten the Prices array
from pyspark.sql.functions import explode

# Read the JSON file into a DataFrame
df = spark.read.json("path/to/your/json/file")

# Flatten the nested structure: one output row per element of the Prices array
df_flattened = (
    df.withColumn("Price", explode(df.Prices))
      .select("Clnt", "NaDt", "PrCd", "IFCd", "GLId", "Price.*")
)

# Create a temporary view from the DataFrame
df_flattened.createOrReplaceTempView("temp_view")

# Use SQL to create a table from the view (a different name than the raw
# table created above, to avoid a name clash)
spark.sql("""
    CREATE TABLE your_flattened_table AS
    SELECT * FROM temp_view
""")
