Data Engineering

Databricks Runtime 10.4 LTS - AnalysisException: No such struct field id in 0, 1 after upgrading

Emiel_Smeenk
New Contributor III

Hello,

We are migrating from Databricks Runtime 9.1 LTS to 10.4 LTS, but we're running into unexpected behavior. Our existing code works on every runtime up to and including 10.3, and stopped working in 10.4.

Problem:

We have a nested JSON file that we are flattening into a Spark DataFrame using the code below:

from pyspark.sql import functions as F

# Explode the organizations first, then the ad accounts nested inside each organization.
adaccountsdf = (
    df.withColumn('Exp_Organizations', F.explode(F.col('organizations.organization')))
      .withColumn('Exp_AdAccounts', F.explode(F.col('Exp_Organizations.ad_accounts')))
      .select(F.col('Exp_Organizations.id').alias('organizationId'),
              F.col('Exp_Organizations.name').alias('organizationName'),
              F.col('Exp_AdAccounts.id').alias('adAccountId'),
              F.col('Exp_AdAccounts.name').alias('adAccountName'),
              F.col('Exp_AdAccounts.timezone').alias('timezone'))
)
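For readers who want to see what the two explode steps above are expected to produce, here is a minimal plain-Python sketch. The sample document and the helper name flatten_ad_accounts are hypothetical; only the field names come from the code above. It illustrates the intended result, not how Spark executes the query.

```python
# Hypothetical sample with the same shape as the nested JSON described in the post.
sample = {
    "organizations": {
        "organization": [
            {
                "id": "org-1",
                "name": "Org One",
                "ad_accounts": [
                    {"id": "acc-1", "name": "Account A", "timezone": "UTC"},
                    {"id": "acc-2", "name": "Account B", "timezone": "Europe/Amsterdam"},
                ],
            },
            # An organization with no ad accounts yields no rows,
            # which matches F.explode dropping empty arrays.
            {"id": "org-2", "name": "Org Two", "ad_accounts": []},
        ]
    }
}


def flatten_ad_accounts(doc):
    """Return one flat row per (organization, ad account) pair."""
    rows = []
    for org in doc["organizations"]["organization"]:  # first explode
        for account in org["ad_accounts"]:            # second explode
            rows.append({
                "organizationId": org["id"],
                "organizationName": org["name"],
                "adAccountId": account["id"],
                "adAccountName": account["name"],
                "timezone": account["timezone"],
            })
    return rows


rows = flatten_ad_accounts(sample)
for row in rows:
    print(row)
```

Each output row carries the five columns that appear in the DataFrame schema, so the sketch can serve as a reference when checking the flattened result on a small input.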

When we query the dataframe, the following selects work (results hidden for confidentiality):

display(adaccountsdf.select("*"))
 
OR
 
display(adaccountsdf)

When I display the schema of the dataframe we get the following:

root
 |-- organizationId: string (nullable = true)
 |-- organizationName: string (nullable = true)
 |-- adAccountId: string (nullable = true)
 |-- adAccountName: string (nullable = true)
 |-- timezone: string (nullable = true)

So everything looks as it should. The moment we start selecting the last three fields (adAccountId, adAccountName and timezone), we get the following error:

[image: error screenshot]

However, when we select a single column it works fine:

[image: result of the single-column select]

Does anyone know why this is happening? It's a very strange error that only shows up in Databricks Runtime 10.4. All previous runtimes, including 10.3, 10.2, 10.1 and 9.1 LTS, work fine. The issue seems to be caused by using the explode function on an already exploded column in the dataframe.

UPDATE:

For some reason, when I run adaccountsdf.cache() before my select statements, the issue disappears. I would still like to know what causes this issue in runtime 10.4 but not in the others.
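As a rough sketch of the workaround described in this update, the cache call can be wrapped in a small helper so the select always runs against the cached dataframe. The helper name select_ad_account_columns is hypothetical, and this only mirrors what was reported to work; it does not explain the root cause.

```python
def select_ad_account_columns(adaccountsdf):
    """Cache the dataframe first, then select the three ad account columns.

    Assumes adaccountsdf is the flattened PySpark DataFrame from the post.
    DataFrame.cache() returns the DataFrame itself, so the calls can be chained.
    """
    cached = adaccountsdf.cache()
    return cached.select("adAccountId", "adAccountName", "timezone")
```

In a notebook this would be used as display(select_ad_account_columns(adaccountsdf)). Keep in mind that cache() holds the data in memory or on disk, so adaccountsdf.unpersist() may be worth calling once the dataframe is no longer needed.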

ACCEPTED SOLUTION

Emiel_Smeenk
New Contributor III

It seems like the issue was miraculously resolved. I did not make any code changes but everything is now running as expected.

Maybe the latest runtime 10.4 fix released on April 19th also resolved this issue unintentionally.

5 REPLIES

Nirupam
New Contributor III

@Emiel Smeenk

We were facing the same issue, and from 2022-Apr-20 onwards it suddenly resolved itself.

Question: Is there a website where I can see or track these "patches"?

Edit: Added Question.

Nirupam
New Contributor III

@Kaniz Fatma

Your answer resolves my query. Thanks!

In addition, for fellow developers, I later noticed that these release notes are also available on the home screen of your Databricks workspace.

Nirupam
New Contributor III

@Kaniz Fatma I did not ask the original question.

@Emiel Smeenk asked and answered his own question, stating that the issue was fixed on its own (probably due to the latest patch).

Emiel_Smeenk
New Contributor III

The issue resolved on its own, so I selected that as the best answer for this post.

Thanks,

Emiel
