I am using the second example from Databricks' official document here: Work with workspace files, but I'm getting the error below.
Question: What could be a cause of the error, and how can we fix it?
ERROR: Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the
referenced columns only include the internal corrupt record column
(named _corrupt_record by default)
Code:
%sql
SELECT * FROM json.`file:/Workspace/Users/myusername@outlook.com/myJsonFile_in_Workspace.json`;
JSON file in my Databricks Workspace:
{
  "header": {
    "platform": "atm",
    "version": "2.0"
  },
  "details": [
    {
      "abc": "3",
      "def": "4"
    },
    {
      "abc": "5",
      "def": "6"
    }
  ]
}
Remarks: Minifying the JSON down to a single line is one possibility, but the JSON used in my post is only there to explain the question. The actual JSON that I am using is quite large and complex, and for such cases Apache Spark's official documentation recommends: `For a regular multi-line JSON file, set the multiLine parameter to True` - as shown in this example. But I'm not sure how to use this option when reading JSON from a Databricks Workspace file, which is what my code above does; a rough PySpark sketch of what I mean is below.
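For reference, this is a minimal PySpark sketch of what I understand the multiLine read to look like, pointed at the same workspace file as my SQL query above (I have not verified that this is the right approach for workspace files, which is part of my question):
%python
# Parse the whole file as one JSON document instead of one record per line.
df = spark.read.option("multiLine", "true").json(
    "file:/Workspace/Users/myusername@outlook.com/myJsonFile_in_Workspace.json"
)
display(df)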