Hi
I have multiple json files stored in my ADLS2 and I want to create a table in which will directly read all the data from ADLS without mounting the files. When I create the table, i cannot select all the data How can i achieve this.
ADLS Path : /dwh/test/visit/visit_dt=2024-07-01
File1 : [{ "results": [ { "visit": { "id": "123456"} }, {"visit": { "id":"7890"}} ] }]
File2 : [{ "results": [ { "visit": { "id": "234567"} }, {"visit": { "id":"8901"}} ] }]
File3 : [{ "results": [ { "visit": { "id": "345678"} }, {"visit": { "id":"9012"}} ] }]
Create table :
create table my_visits
(
results ARRAY<struct<visit: STRUCT<
id: STRING
>>>, visit_dt STRING
)
USING JSON
partition by (visit_dt)
options (
path='wdbss://dwh@devdatalake.dfs.core.windows.net/test/visit",
multiline = TRUE
);
When I select all the data from the table, I get only selected records
Query : select results.visit.id from my_visits;
Output:
["123456","7890"]
["234567","8901"]
["345678","9001"]
Query: select results[0].visit.id from my_visits;
Output:
123456
234567
345678
How can i get all the visit.id in a single rows from all the files.
Any help is appreciated.
Regards