cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

CREATE view USING json and *include* _metadata, _rescued_data

ChristianRRL
Valued Contributor

Title may be self-explanatory. Basically, I'm curious to ask if it's possible (and if so how) to add `_metadata` and `_rescued_data` fields to a view "using json".

e.g.

 

%sql

CREATE OR REPLACE VIEW entity_view
USING json
OPTIONS (path="/.../.*json",multiline=true)

 

Below I show it's possible to do this with `read_files` in Spark SQL:

ChristianRRL_0-1731949214474.png

Below I show an error trying to query these fields.. not sure if I'm doing something wrong:

ChristianRRL_1-1731949348303.png

 

 

4 REPLIES 4

ChristianRRL
Valued Contributor

Forgot to add this as reference:

The JSON Files documentation doesn't seem to show any OPTIONS that may enable the functionality that I'm looking for.. But please feel free to correct me if there's a way to achieve what I'm looking for that I may be overlooking!

Nam_Nguyen
Databricks Employee
Databricks Employee

Hi @ChristianRRL , I'll be looking into this, and I'll get back to you with an answer

Nam_Nguyen
Databricks Employee
Databricks Employee

Hi @ChristianRRL , as a first quick look, could you please try to create a PySpark dataframe with the _metadata and _rescued_data columns, query the dataframe to make sure you can see those columns, and then create a view using this dataframe?

akhil393
Databricks Employee
Databricks Employee

Hi @ChristianRRL You can still use the same method read_files when creating the view, I see that you are using classic hive style reader instead of using the read_files in the actual view definition of sql and you don't need to use spark.sql, please see below. 

akhil393_0-1732290590854.jpeg

Example code: 

create or replace view json_view
as
SELECT _metadata, *
FROM read_files(
's3://********/_delta_log/*.json',
format => 'json')

 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group