03-15-2023 12:01 AM
@Sergey Ivanychev: Here are a few ideas to consider. Please look into each one and rule out any that you have already tried.
It looks like the attributes referenced in the query plan do not match the attributes that are actually available in the data being processed. The error message says that the attributes group_id#72, display_name#73, parent_id#74, path#75, path_list#76 are missing from the input day#178, ac_key#179, group_id#180, display_name#181, parent_id#182, path#183, path_list#184. Note that the column names are the same in both lists; what differs is the number after the #, which is the ID Spark assigns to each attribute when it resolves the plan. The same mismatch shows up in the plan itself, where the !Project operator refers to group_id#72, display_name#73, parent_id#74, path#75, path_list#76, but the subsequent Project operator uses group_id#166, display_name#167, parent_id#168, path#169, path_list#170. This could be a sign of a bug in the optimizer or a discrepancy in the metadata about the schema of the input data.
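To make the comparison concrete, here is a minimal sketch in plain Python (no Spark required) that parses the two attribute lists quoted from the error message and reports, for each missing attribute, whether the name is absent from the input or is present under a different ID. The helper names `parse_attrs` and `diagnose` are made up for this illustration and are not part of any Spark API.

```python
import re

# Matches tokens such as "group_id#72" and captures the name and the numeric ID.
ATTR = re.compile(r"(\w+)#(\d+)")

def parse_attrs(text):
    """Return {name: set of IDs} for every name#id token found in text."""
    attrs = {}
    for name, attr_id in ATTR.findall(text):
        attrs.setdefault(name, set()).add(int(attr_id))
    return attrs

def diagnose(missing_text, input_text):
    """Classify each missing attribute against the attributes in the input."""
    missing = parse_attrs(missing_text)
    available = parse_attrs(input_text)
    report = {}
    for name, ids in missing.items():
        if name not in available:
            report[name] = "absent from input"
        elif ids.isdisjoint(available[name]):
            report[name] = "same name, different ID"
        else:
            report[name] = "present"
    return report

# Both strings are copied from the error message discussed above.
missing = "group_id#72, display_name#73, parent_id#74, path#75, path_list#76"
available = ("day#178, ac_key#179, group_id#180, display_name#181, "
             "parent_id#182, path#183, path_list#184")

report = diagnose(missing, available)
for name, status in report.items():
    print(f"{name}: {status}")
```

If every attribute comes back as "same name, different ID", the columns themselves exist and the problem is more likely in how the plan was resolved than in the data.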
You might want to investigate whether the schema of the input data has changed in any way during the upgrade process, or whether there are any other factors that could have caused the attribute names to be mismatched.
You could also try to explicitly specify the schema of the input data when reading it in, to ensure that the attributes are being correctly identified.
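A quick way to check for schema drift is to compare the column names you expect with the ones the DataFrame actually reports. The sketch below is plain Python and only assumes you can obtain the column list (in PySpark, `df.columns` returns it as a list of strings). The function name `schema_drift` and the `actual` list are hypothetical and only there for illustration.

```python
def schema_drift(expected_columns, actual_columns):
    """Compare two column lists and report names that were dropped or added."""
    expected = list(expected_columns)
    actual = list(actual_columns)
    return {
        "missing": [c for c in expected if c not in actual],
        "unexpected": [c for c in actual if c not in expected],
    }

# Column names taken from the input side of the error message.
expected = ["day", "ac_key", "group_id", "display_name",
            "parent_id", "path", "path_list"]

# Hypothetical stand-in for df.columns after the upgrade; replace with the real list.
actual = ["day", "ac_key", "group_id", "display_name", "parent_id", "path"]

drift = schema_drift(expected, actual)
print(drift)
```

Running the same check against the table before and after the upgrade should show whether any columns were dropped or added along the way.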
The error message also indicates that attributes are missing from the output of a project operation, which causes the failure when writing to Delta. Specifically, "group_id", "display_name", "parent_id", "path", and "path_list" appear to be missing from that output.
This issue may be related to the DeltaOptimizedWriter, but the problem could also be elsewhere in the pipeline. It would help to review the code that generates this pipeline for obvious issues, such as missing or incorrect column mappings.