I have schema:
| |-- costCentres: struct (nullable = true)
| | |-- dimension1: struct (nullable = true)
| | | |-- name: string (nullable = true)
| | | |-- value: string (nullable = true)
| | |-- dimension10: struct (nullable = true)
| | | |-- name: string (nullable = true)
| | | |-- value: string (nullable = true)
When I use dataframe to select and save:
df = df_positions.selectExpr(
"positions.costCentres.dimension1.value as u_kb01",
"positions.costCentres.dimension10.value as u_kb10",
"positions.costCentres.dimension2.value as u_kb02",
"positions.costCentres.dimension3.value as u_kb03",
"positions.costCentres.dimension4.value as u_kb04",
"positions.costCentres.dimension5.value as u_kb05",
"positions.costCentres.dimension6.value as u_kb06",
"positions.costCentres.dimension7.value as u_kb07",
"positions.costCentres.dimension8.value as u_kb08",
"positions.costCentres.dimension9.value as u_kb09",
).distinct()
df.write.saveAsTable("test_costcenters")
df.write.save("/temp/test_costcenters")
I get required result and I'm happy.
When I do the same in Delta Live tables:
def gold_costcenter():
return df_positions.selectExpr(
"positions.costCentres.dimension1.value as u_kb01",
"positions.costCentres.dimension10.value as u_kb10",
"positions.costCentres.dimension2.value as u_kb02",
"positions.costCentres.dimension3.value as u_kb03",
"positions.costCentres.dimension4.value as u_kb04",
"positions.costCentres.dimension5.value as u_kb05",
"positions.costCentres.dimension6.value as u_kb06",
"positions.costCentres.dimension7.value as u_kb07",
"positions.costCentres.dimension8.value as u_kb08",
"positions.costCentres.dimension9.value as u_kb09",
).distinct()
I get an error:
org.apache.spark.sql.AnalysisException: Ambiguous reference to fields StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true)
at org.apache.spark.sql.errors.QueryCompilationErrors$.ambiguousReferenceToFieldsError(QueryCompilationErrors.scala:1587)
Why??? And how to resolve this?