cancel
Showing results for 
Search instead for 
Did you mean: 
Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.
cancel
Showing results for 
Search instead for 
Did you mean: 

Creating a table in ADB SQL for multiple JSON files and selecting all the rows from all the files

adb_newbie
New Contributor II

Hi

I have multiple json files stored in my ADLS2 and I want to create a table in which will directly read all the data from ADLS without mounting the files. When I create the table, i cannot select all the data How can i achieve this.

ADLS Path : /dwh/test/visit/visit_dt=2024-07-01
 
File1 : [{ "results": [ { "visit": { "id": "123456"} }, {"visit": { "id":"7890"}} ] }]
File2 : [{ "results": [ { "visit": { "id": "234567"} }, {"visit": { "id":"8901"}} ] }]
File3 : [{ "results": [ { "visit": { "id": "345678"} }, {"visit": { "id":"9012"}} ] }]
 
Create table :
 
create table my_visits 
(
results ARRAY<struct<visit: STRUCT<
id: STRING
>>>, visit_dt STRING
)
USING JSON
partition by (visit_dt)
options (
path='wdbss://dwh@devdatalake.dfs.core.windows.net/test/visit",
multiline = TRUE
);
 
When I select all the data from the table, I get only selected records
 
Query : select results.visit.id from my_visits;
Output: 
["123456","7890"]
["234567","8901"]
["345678","9001"]
 
Query: select results[0].visit.id from my_visits;
Output:
123456
234567
345678
 
How can i get all the visit.id in a single rows from all the files.
Any help is appreciated.
 
Regards
 

 

0 REPLIES 0

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group