
Reading JSON from Databricks Workspace

Saf4Databricks
New Contributor III

I am using the second example from Databricks' official documentation here: Work with workspace files. But I'm getting the following error:

Question: What could be the cause of this error, and how can I fix it?

ERROR: Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the
referenced columns only include the internal corrupt record column
(named _corrupt_record by default)

Code:

%sql
SELECT * FROM json.`file:/Workspace/Users/myusername@outlook.com/myJsonFile_in_Workspace.json`;

JSON file in my Databricks Workspace:

{
  "header": {
    "platform": "atm",
    "version": "2.0"
  },
  "details": [
    {
      "abc": "3",
      "def": "4"
    },
    {
      "abc": "5",
      "def": "6"
    }
  ]
}

 

Remarks: Minifying the JSON to a single line is one possibility. The JSON used in my post is for explaining the question only. The actual JSON that I am using is quite large and complex, and in such cases Apache Spark's official documentation recommends: `For a regular multi-line JSON file, set the multiline parameter to True` - as shown in this example. But I'm not sure how to use this option when reading JSON from a Databricks Workspace file, which is what my code above is doing.
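
For reference, this is a minimal PySpark sketch of the kind of read I have in mind (assuming the same file:/Workspace/... path from my SQL query can be passed to spark.read together with the multiLine option - I have not confirmed this works for Workspace files):

%python
# Sketch (untested): read the multi-line JSON from the Workspace path
# with the multiLine option, instead of the SQL json.`...` path syntax.
df = (
    spark.read
    .option("multiLine", "true")
    .json("file:/Workspace/Users/myusername@outlook.com/myJsonFile_in_Workspace.json")
)
df.printSchema()
display(df)

Is this the right approach for Workspace files, or is there a way to pass the multiLine option directly in the SQL query?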
