Hi!
I have a Delta table and a process that reads a stream from this table.
I need to drop the NOT NULL constraint from some of the columns of this table.
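(For reference, the current nullability of each column can be confirmed with a small helper like this — a sketch; in the notebook it would be fed `spark.table("temp_schema.temp_stream_table").schema.fields`:)

```python
def nullability(schema_fields):
    """Map column name -> nullable flag.

    schema_fields is expected to look like the .fields of a PySpark
    StructType (each item has .name and .nullable attributes).
    """
    return {f.name: f.nullable for f in schema_fields}

# In the notebook:
#   nullability(spark.table("temp_schema.temp_stream_table").schema.fields)
```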
The first drop command does not affect the reading stream, but the second one fails with an error:
StreamingQueryException: Detected schema change:
old schema: root
 |-- a: string (nullable = true)
 |-- b: string (nullable = true)
 |-- c: timestamp (nullable = false)
new schema: root
 |-- a: string (nullable = true)
 |-- b: string (nullable = false)
 |-- c: timestamp (nullable = false)
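(As a possible workaround I considered restarting the query once when this error appears, since after a restart the source should pick up the table's new schema. This is only a sketch of my own, not a Databricks API — the `retries` parameter and the broad `Exception` catch are assumptions, and the string match is based on the error text above:)

```python
def run_with_restart(run_stream, retries=1):
    """Restart the stream if it fails on a detected schema change.

    run_stream: a zero-argument callable that starts the stream and blocks
    until termination (like run_stream() in the notebook below).
    """
    for attempt in range(retries + 1):
        try:
            return run_stream()
        except Exception as exc:  # StreamingQueryException in a real run
            # Re-raise if retries are exhausted or it is a different error.
            if attempt == retries or "Detected schema change" not in str(exc):
                raise
            # Otherwise loop and restart; the restarted query re-reads
            # the (now relaxed) table schema from the Delta log.
```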
Full Python notebook:
# Databricks notebook source
# MAGIC %sql
# MAGIC CREATE SCHEMA temp_schema
# COMMAND ----------
# MAGIC %sql
# MAGIC CREATE TABLE temp_schema.temp_stream_table (
# MAGIC a STRING NOT NULL,
# MAGIC b STRING NOT NULL,
# MAGIC c TIMESTAMP NOT NULL
# MAGIC )
# MAGIC LOCATION "dbfs:/tmp/temp_stream_table"
# COMMAND ----------
# MAGIC %sql
# MAGIC INSERT INTO temp_schema.temp_stream_table
# MAGIC VALUES ("a", "b", CURRENT_DATE), ("c", "d", CURRENT_DATE)
# COMMAND ----------
def run_stream():
    return (
        spark
        .readStream
        .option("mergeSchema", "true")
        .table("temp_schema.temp_stream_table")
        .writeStream
        .trigger(availableNow=True)
        .option("checkpointLocation", "/dbfs/tmp/temp_stream_checkpoint")
        .foreachBatch(lambda df, _id: None)
        .start()
        .awaitTermination()
    )
# COMMAND ----------
run_stream()
# COMMAND ----------
# MAGIC %sql
# MAGIC ALTER TABLE temp_schema.temp_stream_table
# MAGIC ALTER COLUMN a DROP NOT NULL;
# COMMAND ----------
run_stream()
# Success!
# COMMAND ----------
# MAGIC %sql
# MAGIC ALTER TABLE temp_schema.temp_stream_table
# MAGIC ALTER COLUMN b DROP NOT NULL;
# COMMAND ----------
run_stream()
### Failure here: StreamingQueryException: Detected schema change ...
# COMMAND ----------
# MAGIC %sql
# MAGIC DROP TABLE temp_schema.temp_stream_table;
# MAGIC DROP SCHEMA temp_schema
# COMMAND ----------
dbutils.fs.rm("dbfs:/tmp/temp_stream_checkpoint", True)
dbutils.fs.rm("dbfs:/tmp/temp_stream_table", True)
Is this a bug, or expected behavior?