Data Engineering

Structured Streaming schemaTrackingLocation does not work with starting_version

Volker
New Contributor III

Hello Community,

I came across some strange behaviour when using structured streaming on top of a delta table.

I have a stream that I wanted to start from a specific version of a delta table using option("starting_version", x), because I did not want to stream all the data of the source table but only the newly arriving data. To accommodate future (non-additive) schema changes, I also set option("schemaTrackingLocation", checkpoint_location).
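For reference, here is a minimal sketch of the reader setup described above. The helper names `delta_stream_options` and `read_delta_stream` are hypothetical, as are the table name and path in the usage note. The sketch also assumes the camelCase spelling `startingVersion`, which is how the Delta Lake streaming documentation spells this option; the post itself writes `starting_version`, and whether that spelling plays any role in the behaviour described here is not established.

```python
def delta_stream_options(checkpoint_location, starting_version=None):
    """Build the Delta streaming reader options described in the post.

    Assumption: the option is spelled `startingVersion` (camelCase), as in
    the Delta Lake docs; the post writes it as `starting_version`.
    """
    # Schema tracking is pointed at the checkpoint location, as in the post.
    options = {"schemaTrackingLocation": checkpoint_location}
    if starting_version is not None:
        # Option values are passed as strings.
        options["startingVersion"] = str(starting_version)
    return options


def read_delta_stream(spark, table_name, checkpoint_location, starting_version=None):
    """Return a streaming DataFrame over a Delta table (hypothetical helper).

    `spark` is an existing SparkSession; nothing is read until a streaming
    query is started on the returned DataFrame.
    """
    reader_options = delta_stream_options(checkpoint_location, starting_version)
    return (
        spark.readStream
        .format("delta")
        .options(**reader_options)
        .table(table_name)
    )
```

Usage would look something like `read_delta_stream(spark, "source_table", "/path/to/checkpoint", starting_version=5)`, followed by a `writeStream` that uses the same checkpoint location. Keeping the options in one dictionary makes it easy to run the same stream with and without the starting version when trying to narrow down the cause.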
Now, if I change the schema of the source table, the DataStreamReader neither picks up the schema changes nor writes them to the schemaTrackingLocation. It still infers the old schema, and I can't get it to pick up the changes.
After some trial and error, I found that starting_version is probably the cause of the issue: when I changed the schema on a stream without setting the starting_version option, it worked as intended and picked up the schema changes on the source table.
I'm a bit confused, since starting_version should only have an effect when starting the stream and be ignored otherwise. From the docs:
"They take effect only when starting a new streaming query. If a streaming query has started and the progress has been recorded in its checkpoint, these options are ignored."
https://docs.databricks.com/en/structured-streaming/delta-lake.html#specify-initial-position
Has anybody had a similar problem? Is this intended behaviour? How can I solve this issue, and where could I raise it?
0 REPLIES
