Cannot use Delta Table columns containing struct with date fields in Power BI
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
08-19-2024 05:23 AM
Hi everyone,
I have a Delta Table in Databricks with a column of struct type (containing a field of type date) and a column of type date:
create table date_struct (s struct<d:date>, d date, s_json string); insert into date_struct (s, d, s_json) values ( named_struct('d', '2020-01-01'), '2020-02-01', to_json(named_struct('d', '2020-01-01')))
When importing this date_struct table from Power BI, I get the following result (the s column is mapped to a string and the d column is mapped to a date):
| {"d":2020-01-01} | 01/02/2020 | {"d":"2020-01-01"} |
The value of the s column is a malformed JSON format (missing quotes on date 2020-01-01). As the JSON is malformed, the Power BI function Json.Document() cannot be used to convert the s string column into a JSON document e.g. when adding a new column. However, the same conversion works without error on the pre-serialized s_json column.
let Source = Databricks.Catalogs("xxx.azuredatabricks.net", "/sql/1.0/warehouses/yyy", [Catalog = "", Database = ""]), my_Database = Source{[Name="my",Kind="Database"]}[Data], delta_tutorial_Schema = my_Database{[Name="myschema",Kind="Schema"]}[Data], ds1 = delta_tutorial_Schema{[Name="date_struct",Kind="Table"]}[Data], ds2 = Table.AddColumn(ds1, "record_from_s", each Json.Document([s])), ds3 = Table.AddColumn(ds2, "record_from_s_json", each Json.Document([s_json])) in ds3
gives the following error:
| {"d":2020-01-01} | 01/02/2020 | Error | Record |
DataFormat.Error: We found extra characters at the end of the JSON input. Details: Value=- Position=9
What can I do to access the s.d field ?
Note that the above is a minimal example in order to reproduce the problem. The real use-case is another table with a much more complex struct and no pre-serialized s_json column. Thus, I want a reliable solution, not something like "use Json.Document() on the s_json column" or "add quotes on the s column with the Replace function".
Stackoverflow post: https://stackoverflow.com/questions/78879730/date-fields-are-serialized-without-quotes-when-imported...
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
11-19-2025 03:51 AM
To access the s.d field from your Delta table in Power BI, you need the content of the s column to be correctly formatted as JSON so that Power BI's Json.Document() function can parse it. Your issue arises because the default string representation of struct<d:date> in Databricks returns {"d":2020-01-01}, which is not valid JSON since string values—like dates—need to be quoted: {"d":"2020-01-01"}.
The s_json column works fine because it uses to_json(), which produces a properly quoted JSON string, unlike the default cast of struct to string.
Why This Happens
-
Databricks generates a simplified, spark-internal JSON-like representation for
structvalues when you SELECT or export them directly, not standard JSON (dates are not in quotes). -
Power BI's
Json.Document()requires strict JSON compliance: string values must be quoted.
Solutions
1. Use to_json() in Your Databricks Views/Queries
Ensure that the data you supply to Power BI for complex columns like structs is always pre-serialized to a valid JSON string.
For example, in your Databricks SQL, use:
SELECT to_json(s) as s_json, d FROM date_struct
And then let Power BI use the s_json column, as it already does in your example. This is the cleanest and most reliable approach.
2. Modify Power BI M Query to Parse the Malformed JSON
If you cannot change your Databricks view/table, you can fix the malformed JSON in Power BI before calling Json.Document(). You can substitute the date pattern without quotes, e.g.:
let
// previous steps,
FixedS = Table.TransformColumns(ds1, {
"s", each Text.Replace(Text.From(_), "2020-01-01", """2020-01-01"""), type text}
),
AddedRecord = Table.AddColumn(FixedS, "record_from_s", each Json.Document([s]))
in
AddedRecord
However, this approach is fragile because it only covers known date values and structures. It's much better to fix this at the source using to_json().
3. Project Struct Fields Directly
If you just need a field like s.d, select it directly in your SQL:
SELECT s.d, d FROM date_struct
This gives a flat table without needing JSON conversion.
Recommendation
Always use to_json() in your Databricks SQL when exporting struct columns to Power BI. This guarantees your data is valid JSON and savable as a string, which can then be parsed by Power BI's Json.Document() with no issues.