Team,
I am trying to understand how the parquet files and the JSON files under the _delta_log folder store the data behind the scenes.
Step 1: Table creation
from delta.tables import *

DeltaTable.create(spark) \
    .tableName("employee") \
    .addColumn("id", "INT") \
    .addColumn("name", "STRING") \
    .addColumn("dept", "STRING") \
    .addColumn("salary", "INT") \
    .location("/FileStore/tables/delta/demo2") \
    .execute()
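For context, right after the create the table location should contain only the transaction log folder with the first commit. A minimal sketch of how I am listing it in a Databricks notebook, assuming the table location above:

# List the transaction log written by DeltaTable.create();
# the first commit should be 00000000000000000000.json plus its .crc checksum
display(dbutils.fs.ls("/FileStore/tables/delta/demo2/_delta_log"))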
Step 2:
%sql
INSERT INTO employee VALUES (100, "Ram", "CSE", 1000)
Step 3:
%sql
SELECT * FROM delta.`/FileStore/tables/delta/demo2`
Note: I ran the INSERT twice, so there are 2 parquet files in the table folder.
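To confirm that count, this is the kind of listing I use on the table folder (same path as above):

# Each INSERT commit writes one part-*.parquet data file here,
# plus a matching numbered .json commit (and .crc) under _delta_log
display(dbutils.fs.ls("/FileStore/tables/delta/demo2"))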
Challenge:
I am trying to read the JSON, CRC, and parquet files directly to see their contents, but I am running into problems (a sketch of the reads I am attempting is below).
The JSON read only gives me the structure of the JSON, not the actual data stored.
Reading the parquet file directly throws an error.
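For reference, a minimal sketch of the kind of direct reads I am attempting. The commit number and the part file name below are placeholders; the actual parquet file name is auto-generated:

# 1) Read one commit file from the transaction log.
#    Commit files are newline-delimited JSON of actions (commitInfo, add,
#    metaData, protocol), so this shows the log structure, not table rows.
log_df = spark.read.json(
    "/FileStore/tables/delta/demo2/_delta_log/00000000000000000001.json"
)
log_df.printSchema()
log_df.show(truncate=False)

# 2) Read one data file directly as parquet -- this is the read that errors.
#    "part-00000-xxxx.snappy.parquet" is a placeholder file name.
data_df = spark.read.parquet(
    "/FileStore/tables/delta/demo2/part-00000-xxxx.snappy.parquet"
)
data_df.show()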
Note: My cluster is running DBR 12.2 LTS.