Databricks Community

merca · 06-03-2022

I have the following schema:

 |    |-- costCentres: struct (nullable = true)
 |    |    |-- dimension1: struct (nullable = true)
 |    |    |    |-- name: string (nullable = true)
 |    |    |    |-- value: string (nullable = true)
 |    |    |-- dimension10: struct (nullable = true)
 |    |    |    |-- name: string (nullable = true)
 |    |    |    |-- value: string (nullable = true)

When I use a DataFrame to select and save:

df = df_positions.selectExpr(
    "positions.costCentres.dimension1.value as u_kb01",
    "positions.costCentres.dimension10.value as u_kb10",
    "positions.costCentres.dimension2.value as u_kb02",
    "positions.costCentres.dimension3.value as u_kb03",
    "positions.costCentres.dimension4.value as u_kb04",
    "positions.costCentres.dimension5.value as u_kb05",
    "positions.costCentres.dimension6.value as u_kb06",
    "positions.costCentres.dimension7.value as u_kb07",
    "positions.costCentres.dimension8.value as u_kb08",
    "positions.costCentres.dimension9.value as u_kb09",
).distinct()
 
df.write.saveAsTable("test_costcenters")
df.write.save("/temp/test_costcenters")

I get the required result and I'm happy.
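The ten expressions above follow a single pattern, so they could be generated rather than typed out. The following is only a minimal sketch of that idea; `cost_centre_exprs` is a hypothetical helper name that does not appear in the original post, and the resulting columns come out in numeric order rather than the order listed above.

```python
def cost_centre_exprs(count=10):
    """Build the selectExpr strings for dimension1..dimension<count>.

    Hypothetical helper: the name and the `count` parameter are
    illustrative and not part of the original post.
    """
    return [
        f"positions.costCentres.dimension{i}.value as u_kb{i:02d}"
        for i in range(1, count + 1)
    ]


exprs = cost_centre_exprs()

# With a Spark DataFrame available, the list can be unpacked into selectExpr:
#   df = df_positions.selectExpr(*exprs).distinct()
```

Generating the strings keeps the alias numbering (`u_kb01` to `u_kb10`) in step with the dimension numbers, which is easy to get wrong when editing by hand.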

When I do the same in Delta Live Tables:

import dlt

@dlt.table
def gold_costcenter():
    return df_positions.selectExpr(
                            "positions.costCentres.dimension1.value as u_kb01",
                            "positions.costCentres.dimension10.value as u_kb10",
                            "positions.costCentres.dimension2.value as u_kb02",
                            "positions.costCentres.dimension3.value as u_kb03",
                            "positions.costCentres.dimension4.value as u_kb04",
                            "positions.costCentres.dimension5.value as u_kb05",
                            "positions.costCentres.dimension6.value as u_kb06",
                            "positions.costCentres.dimension7.value as u_kb07",
                            "positions.costCentres.dimension8.value as u_kb08",
                            "positions.costCentres.dimension9.value as u_kb09",
                        ).distinct()

I get an error:

org.apache.spark.sql.AnalysisException: Ambiguous reference to fields StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true), StructField(value,StringType,true)
at org.apache.spark.sql.errors.QueryCompilationErrors$.ambiguousReferenceToFieldsError(QueryCompilationErrors.scala:1587)

Why? And how can I resolve this?
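Spark raises this exception when a field name matches more than one field of the struct it is being extracted from, and by default it compares names case-insensitively. One way to investigate, offered only as a diagnostic sketch and not as a confirmed fix, is to walk the schema as the pipeline sees it and report any struct that contains repeated field names. `find_duplicate_fields` is a hypothetical helper; it assumes the schema is a dict in the shape returned by PySpark's `StructType.jsonValue()`.

```python
from collections import Counter


def find_duplicate_fields(schema, path="", case_sensitive=False):
    """Return (struct_path, field_name, count) for names repeated in one struct.

    Hypothetical helper. `schema` is assumed to be a dict shaped like the
    output of PySpark's StructType.jsonValue().
    """
    hits = []
    if not isinstance(schema, dict):
        return hits  # primitive type such as "string"
    kind = schema.get("type")
    if kind == "struct":
        fields = schema["fields"]
        names = [f["name"] if case_sensitive else f["name"].lower() for f in fields]
        for name, count in Counter(names).items():
            if count > 1:
                hits.append((path or "<root>", name, count))
        # Recurse into each field's own type
        for f in fields:
            child = f"{path}.{f['name']}" if path else f["name"]
            hits += find_duplicate_fields(f["type"], child, case_sensitive)
    elif kind == "array":
        hits += find_duplicate_fields(schema["elementType"], path, case_sensitive)
    elif kind == "map":
        hits += find_duplicate_fields(schema["valueType"], path, case_sensitive)
    return hits


def _string_field(name):
    # Illustrative test data only, not taken from the original post
    return {"name": name, "type": "string", "nullable": True, "metadata": {}}


example = {
    "type": "struct",
    "fields": [
        {"name": "dimension1", "nullable": True, "metadata": {},
         "type": {"type": "struct",
                  "fields": [_string_field("name"), _string_field("value")]}},
        {"name": "broken", "nullable": True, "metadata": {},
         "type": {"type": "struct",
                  "fields": [_string_field("value"), _string_field("Value")]}},
    ],
}

duplicates = find_duplicate_fields(example)
```

In a notebook or pipeline, the same function could be called as `find_duplicate_fields(df_positions.schema.jsonValue())`. An empty result would suggest the ambiguity comes from somewhere other than the stored schema.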

PeteC · 06-28-2022

I've got the same problem, but using a SQL SELECT statement (with some explodes).

PeteC · 06-30-2022

A colleague is also having the same issue. He thinks he might be close to a solution. I'll update if he does find one.

