Databricks: Change the existing schema of columns to non-nullable for a Delta table using PySpark?

Red_blue_green
New Contributor III

Hello,

I currently have a Delta folder used as a table, with several nullable columns. I want to migrate data into the table, overwrite its content using PySpark, add several new columns, and make them non-nullable. I have found a way to make the columns in the PySpark DataFrame non-nullable:

from pyspark.sql.types import StructType, StructField, StringType

# Define the target schema with non-nullable fields
non_nullable_schema = StructType([
    StructField("column1", StringType(), nullable=False),
    StructField("column2", StringType(), nullable=False),
])

# Apply the new schema to the DataFrame
non_nullable_df = spark.createDataFrame(df.rdd, non_nullable_schema)
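
At this point, printing the schema of the new DataFrame does show the fields as non-nullable (a quick check, using the same column names as in the snippet above):

non_nullable_df.printSchema()
# root
#  |-- column1: string (nullable = false)
#  |-- column2: string (nullable = false)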

But it seems that after I write to the existing Delta destination folder and load it again, the columns show as nullable again when I print the schema:

non_nullable_df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").save("/path/to/delta/files")

df_read = spark.read.format("delta").load("/path/to/delta/files")
df_read.printSchema()

Is there any way to change an existing Delta schema to non-nullable using PySpark without creating a new Delta table?

2 REPLIES

Husky
New Contributor II

You could save the DataFrame as a table instead of a Delta file and then alter the table to make the columns non-nullable:

# Placeholders: fill in your own table and column names
table = "<your_table_name>"
column_name = "<column_name>"

# Save the DataFrame as a managed table instead of writing to a path
non_nullable_df.write.saveAsTable(table, mode="overwrite")

# Add a NOT NULL constraint on the column
spark.sql(f"ALTER TABLE {table} ALTER COLUMN {column_name} SET NOT NULL")

Make sure that there are no null values in the column that you want to make non-nullable; otherwise you will get an error.
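
For example, a quick way to check for remaining nulls before adding the constraint (just a sketch, reusing the placeholder names from above):

# Count null values in the target column; the constraint will fail if this is > 0
null_count = spark.sql(f"SELECT COUNT(*) AS c FROM {table} WHERE {column_name} IS NULL").first()["c"]
if null_count > 0:
    print(f"{column_name} still has {null_count} null rows; clean these up before ALTER TABLE")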

Red_blue_green
New Contributor III

Thank you for the suggestion, but my current constraint is unfortunately to work with Delta files, so saving as a table would not be enough.
