Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Delta file partitions

thushar
Contributor

We have a function that creates files with partitions; the partition columns come from metadata we maintain (getPartitionColumns). For one table, two columns are listed as partition columns, say 'Team' and 'Speciality'.

While executing, the partition columns are not substituted properly in the DataFrame's write method, and I get an error like the one below:

AnalysisException: Partition column `"Team","Speciality"` not found in schema

But these columns already exist in the DataFrame. Any idea how to resolve this?

It seems the value `"Team","Speciality"` is treated as a single column instead of two separate columns.

def dfWrite(df, targetPath, tableName):
    partitionColumn = getPartitionColumns(tableName)  # "Team", "Speciality"
    df.write.option("header", True) \
        .partitionBy(partitionColumn) \
        .mode("overwrite") \
        .csv(targetPath)

4 REPLIES

pvignesh92
Honored Contributor

Hi Thushar,

You have not mentioned the return type of the getPartitionColumns method. You need to return the partition columns as a collection, e.g. a list: ['Team', 'Speciality'].

Then the method below should work:

df.write.option("header", True) \
    .partitionBy(*partitionColumn) \
    .mode("overwrite") \
    .csv(targetPath)

Kindly try.

Hi Vignesh,

Thanks, the return type was a string; I converted it to a tuple and it is working now.
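For anyone hitting the same issue: a minimal sketch of the fix, assuming getPartitionColumns returns a single string like '"Team","Speciality"' (the helper name parse_partition_columns is hypothetical, not part of the original code). The idea is to split the string into a list of clean column names and unpack it with * into partitionBy:

```python
# Hypothetical helper: turn a metadata string such as '"Team","Speciality"'
# into a list of column names that partitionBy can accept.
def parse_partition_columns(raw):
    # Split on commas, then strip whitespace and surrounding quotes
    # from each piece; drop any empty fragments.
    return [c.strip().strip('"').strip("'") for c in raw.split(",") if c.strip()]

cols = parse_partition_columns('"Team","Speciality"')
# cols is now ['Team', 'Speciality'], so the write becomes:
# df.write.option("header", True).partitionBy(*cols).mode("overwrite").csv(targetPath)
```

Passing the raw string to partitionBy makes Spark look for one column literally named `"Team","Speciality"`, which is why the AnalysisException reports that column as missing from the schema.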

pvignesh92
Honored Contributor

Hi Thushar,

Please upvote and mark this as the accepted answer so that the thread can be closed.

Anonymous
Not applicable

Hi @Thushar R

Hope everything is going great.

Just wanted to check in on whether you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please let us know so we can help you.

Cheers!
