I have a Delta Table in Databricks with a column of struct type (containing a field of type date) and a column of type date:
create table date_struct (s struct<d:date>, d date, s_json string);
insert into date_struct (s, d, s_json) values (
named_struct('d', '2020-01-01'),
'2020-02-01',
to_json(named_struct('d', '2020-01-01')))
When importing this date_struct table from Power BI, I get the following result (the s column is mapped to a string and the d column is mapped to a date):
s d s_json
{"d":2020-01-01} | 01/02/2020 | {"d":"2020-01-01"} |
The value of the s column is a malformed JSON format (missing quotes on date 2020-01-01). As the JSON is malformed, the Power BI function Json.Document() cannot be used to convert the s string column into a JSON document e.g. when adding a new column. However, the same conversion works without error on the pre-serialized s_json column.
let
Source = Databricks.Catalogs("xxx.azuredatabricks.net", "/sql/1.0/warehouses/yyy", [Catalog = "", Database = ""]),
my_Database = Source{[Name="my",Kind="Database"]}[Data],
delta_tutorial_Schema = my_Database{[Name="myschema",Kind="Schema"]}[Data],
ds1 = delta_tutorial_Schema{[Name="date_struct",Kind="Table"]}[Data],
ds2 = Table.AddColumn(ds1, "record_from_s", each Json.Document([s])),
ds3 = Table.AddColumn(ds2, "record_from_s_json", each Json.Document([s_json]))
in
ds3
gives the following error:
s d record_from_s record_from_s_json
{"d":2020-01-01} | 01/02/2020 | Error | Record |
DataFormat.Error: We found extra characters at the end of the JSON input.
Details:
Value=-
Position=9
What can I do to access the s.d field ?
Note that the above is a minimal example in order to reproduce the problem. The real use-case is another table with a much more complex struct and no pre-serialized s_json column. Thus, I want a reliable solution, not something like "use Json.Document() on the s_json column" or "add quotes on the s column with the Replace function".