Hi @IM_01 !
I think your issue comes from using the rate source as a dummy empty stream.
The rate source stores its partition count in the streaming checkpoint, and because numPartitions was not explicitly set, it can change between runs depending on cluster size or default parallelism. That mismatch is what triggers STREAMING_RATE_SOURCE_V2_PARTITION_NUM_CHANGE_UNSUPPORTED.
However, I would not recommend this pattern in Lakeflow SDP. A pipeline table function should only return a DataFrame; it should not write to an audit table with df.write.insertInto(), because dataset definitions may be analyzed or even retried multiple times, so the side effects can be duplicated or fail unexpectedly.
So let the pipeline fail normally and use the Lakeflow pipeline event log for monitoring. If needed, create a separate monitoring job, query, or event hook to copy error events into your audit table.
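As a minimal sketch of that monitoring approach, you can read error events from the pipeline event log with the event_log table-valued function (the table name below is a placeholder; the audit table name is an assumption for illustration):

```python
# Query the Lakeflow/DLT event log for error-level events of one pipeline table.
# `my_catalog.my_schema.my_table` is a placeholder for a table your pipeline publishes.
error_events_query = """
    SELECT timestamp, event_type, message, error
    FROM event_log(TABLE(my_catalog.my_schema.my_table))
    WHERE level = 'ERROR'
    ORDER BY timestamp DESC
"""

# In a Databricks notebook or scheduled monitoring job you would then run, e.g.:
# errors_df = spark.sql(error_events_query)
# errors_df.write.mode("append").saveAsTable("my_catalog.audit.pipeline_errors")
```

Running this as a separate job keeps the side-effecting write out of the pipeline's dataset definitions, so retries of the pipeline can never duplicate audit rows.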
If you still need a workaround for the empty stream, you can explicitly set numPartitions on the rate source and keep it fixed for the lifetime of the checkpoint:
from pyspark.sql import functions as F

# tb_schema: list of (column_name, column_type) pairs for the target table,
# defined elsewhere in your pipeline code.
df = (
    spark.readStream
    .format("rate")
    .option("numPartitions", "1")  # pin the partition count so it stays stable across runs
    .option("rowsPerSecond", "1")
    .load()
)
df = df.select(
    *[F.lit(None).cast(coltype).alias(colname) for colname, coltype in tb_schema]
).where("false")  # always-false predicate keeps the stream empty
return df
But if the existing checkpoint has already stored numPartitions = 0, you should either set numPartitions to the value shown in the error message, or do a full refresh of the pipeline before changing it to a new stable value.
If this answer resolves your question, could you please mark it as “Accept as Solution”? It will help other users quickly find the correct fix.
Senior BI/Data Engineer | Microsoft MVP Data Platform | Microsoft MVP Power BI | Power BI Super User | C# Corner MVP