DBR 12.2: DeltaOptimizedWriter: Resolved attribute(s) missing from in operator
03-07-2023 11:23 AM
After upgrading from DBR 11.3 LTS to DBR 12.2 LTS, we started seeing the following error in a "read from Parquet and write to Delta" step of our pipeline.
AnalysisException: Resolved attribute(s) group_id#72,display_name#73,parent_id#74,path#75,path_list#76 missing from day#178,ac_key#179,group_id#180,display_name#181,parent_id#182,path#183,path_list#184 in operator !Project [empty2null(day#178) AS day#568, empty2null(ac_key#179) AS ac_key#569, group_id#72, display_name#73, parent_id#74, path#75, path_list#76]. Attribute(s) with the same name appear in the operation: group_id,display_name,parent_id,path,path_list. Please check if the right attribute(s) are used.;
WriteIntoDeltaCommand OutputSpec(s3://constructor-analytics-data/tables/delta_prod/item_groups,Map(),ArrayBuffer(day#178, ac_key#179, group_id#180, display_name#181, parent_id#182, path#183, path_list#184))
+- DeltaOptimizedWriter [day, ac_key], com.databricks.sql.transaction.tahoe.DeltaLog@3dc2a8b5, [spark.databricks.delta.optimize.minFileSize=268435456, spark.databricks.delta.autoCompact.maxFileSize=134217728, spark.databricks.delta.optimize.maxFileSize=268435456, spark.databricks.delta.autoCompact.minFileSize=67108864]
+- DeltaInvariantChecker [Check(EXPRESSION(('day = 2023-03-07)),('day = 2023-03-07)), Check(EXPRESSION(('ac_key = key_ZMdl8uk3o2FQ3Bc9)),('ac_key = key_ZMdl8uk3o2FQ3Bc9))]
+- !Project [empty2null(day#178) AS day#568, empty2null(ac_key#179) AS ac_key#569, group_id#72, display_name#73, parent_id#74, path#75, path_list#76]
+- Project [day#164 AS day#178, ac_key#165 AS ac_key#179, group_id#166 AS group_id#180, display_name#167 AS display_name#181, parent_id#168 AS parent_id#182, path#169 AS path#183, path_list#170 AS path_list#184]
+- Project [day#108 AS day#164, ac_key#116 AS ac_key#165, group_id#124 AS group_id#166, display_name#132 AS display_name#167, parent_id#140 AS parent_id#168, path#148 AS path#169, path_list#156 AS path_list#170]
+- Project [day#108, ac_key#116, group_id#124, display_name#132, parent_id#140, path#148, path_list#76 AS path_list#156]
+- Project [day#108, ac_key#116, group_id#124, display_name#132, parent_id#140, path#75 AS path#148, path_list#76]
+- Project [day#108, ac_key#116, group_id#124, display_name#132, parent_id#74 AS parent_id#140, path#75, path_list#76]
+- Project [day#108, ac_key#116, group_id#124, display_name#73 AS display_name#132, parent_id#74, path#75, path_list#76]
+- Project [day#108, ac_key#116, group_id#72 AS group_id#124, display_name#73, parent_id#74, path#75, path_list#76]
+- Project [day#108, ac_key#93 AS ac_key#116, group_id#72, display_name#73, parent_id#74, path#75, path_list#76]
+- Project [day#84 AS day#108, ac_key#93, group_id#72, display_name#73, parent_id#74, path#75, path_list#76]
+- Project [day#84, ac_key#93, group_id#72, display_name#73, parent_id#74, path#75, path_list#76]
+- Project [day#84, key_ZMdl8uk3o2FQ3Bc9 AS ac_key#93, group_id#72, display_name#73, parent_id#74, path#75, path_list#76]
+- Project [2023-03-07 AS day#84, ac_key#71, group_id#72, display_name#73, parent_id#74, path#75, path_list#76]
+- Relation [day#70,ac_key#71,group_id#72,display_name#73,parent_id#74,path#75,path_list#76] parquet
The weird thing here is that the !Project references group_id#72, while the Project directly below it outputs group_id#180, as if there's a bug in the plan. There are no joins in this pipeline; it's as simple as read + write to Delta.
Do you have any idea what could be wrong here? Could this be a DeltaOptimizedWriter issue?
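To make the mismatch concrete, here is a small standalone Python sketch (plain string processing, no Spark involved) that compares the attributes the failing !Project reads with the attributes its child Project outputs. The two strings are copied from the plan above; the helper names `attrs`, `referenced`, and `missing` are just for illustration and not part of any Spark or Delta API.

```python
import re

# Expression list of the failing node, copied from the plan above.
bad_project = (
    "empty2null(day#178) AS day#568, empty2null(ac_key#179) AS ac_key#569, "
    "group_id#72, display_name#73, parent_id#74, path#75, path_list#76"
)

# Output of the Project directly below it, copied from the plan above.
child_output = (
    "day#178, ac_key#179, group_id#180, display_name#181, "
    "parent_id#182, path#183, path_list#184"
)


def attrs(text):
    """Return every name#id token that appears in the text."""
    return set(re.findall(r"\w+#\d+", text))


def referenced(expr_list):
    """Attributes an expression list reads, i.e. all name#id tokens
    except the new ones it introduces via 'AS name#id'."""
    defined = set(re.findall(r"AS (\w+#\d+)", expr_list))
    return attrs(expr_list) - defined


# Attributes the !Project needs that its child does not provide,
# sorted by expression id for readability.
missing = sorted(
    referenced(bad_project) - attrs(child_output),
    key=lambda a: int(a.split("#")[1]),
)
print(missing)
```

The resulting list is the same set of attributes the AnalysisException reports as missing, all of which carry the expression ids of the original parquet Relation rather than the ids produced by the last Project.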
Sergey