Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Conditionally create a dataframe

Braxx
Contributor II

I would like to implement some simple logic:

if Df1 is empty, return Df2; otherwise newDf = Df1.union(Df2)

It may happen that Df1 is empty and the output is simply []. In that case I do not need the union.

I have it like this, but I get an error when creating the dataframe:

if len(Df1) == 0:
  Df2
else:
  newDf=Df1.union(Df2)

1 ACCEPTED SOLUTION

-werners-
Esteemed Contributor III

A DataFrame does not support len(). Use df.count() instead.

That being said, it is a good idea to always assign the result of your if-else to the same dataframe variable, which makes it easier to use afterwards:

if df1.count() == 0:
  newdf = df2
else:
  newdf = df1.union(df2)

That way you know the result is always in newdf, rather than in either df2 or newdf.
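The same if-else can be wrapped in a small helper so the caller always gets one dataframe back. This is only a sketch: the function name `union_unless_empty` is made up for illustration, and it assumes both inputs are PySpark DataFrames with the same column order, since `union` matches columns by position.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only, so the helper has no runtime Spark import.
    from pyspark.sql import DataFrame


def union_unless_empty(df1: "DataFrame", df2: "DataFrame") -> "DataFrame":
    """Return df2 alone when df1 has no rows, otherwise df1.union(df2).

    Hypothetical helper name; not part of the PySpark API.
    """
    # count() must be called. `df1.count == 0` compares the bound method
    # itself with 0, which is always False, so the else branch always runs.
    if df1.count() == 0:
        return df2
    # Assumption: df1 and df2 share the same column order and compatible types.
    return df1.union(df2)
```

With that in place, the call site becomes `newdf = union_unless_empty(df1, df2)`.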


5 REPLIES

Kaniz_Fatma
Community Manager

Hi @Braxx! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Otherwise I will get back to you soon. Thanks.

You were right. Thanks!

Hubert-Dudek
Esteemed Contributor III

It could probably also be achieved in pure Spark SQL using

if(expr1, expr2, expr3)

where expr1 checks whether there are rows, expr2 returns the union, and expr3 returns Df2. I am not sure it will work; I can check it later.

cconnell
Contributor II

Also try df.head(1).isEmpty
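The `.isEmpty` form above reads like the Scala API. In PySpark, `df.head(1)` returns a plain Python list of at most one Row, so a rough equivalent checks the length of that list. The helper name `is_empty` is an assumption for illustration; newer PySpark releases (3.3 and later, as far as I know) also provide `DataFrame.isEmpty()` directly.

```python
def is_empty(df) -> bool:
    """Return True when the dataframe has no rows.

    Hypothetical helper; `df` is assumed to be a PySpark DataFrame.
    """
    # head(1) fetches at most one row, so Spark does not have to count
    # every row the way count() does.
    return len(df.head(1)) == 0
```

It can replace the count check in the if-else, for example `newdf = df2 if is_empty(df1) else df1.union(df2)`.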
