Delta Table to Spark Streaming to Synapse Table in azure databricks
06-25-2021 09:15 AM
Is there a way to keep my Synapse database always in sync with the latest data from a Delta table? I believe my Synapse database doesn't support being used as a streaming sink. Is there a workaround?
- Labels: Delta table, Spark streaming
06-25-2021 09:17 AM
You could try to keep the data in sync by appending the new data in a foreachBatch on your write stream. foreachBatch allows arbitrary ways of writing each micro-batch, so you can connect to the data warehouse over JDBC if necessary:
df = spark.readStream \
    .format("delta") \
    .load(input_path)

df_write = df.writeStream \
    .format("delta") \
    .foreachBatch(batch_write_jdbc) \
    .option("checkpointLocation", checkpoint) \
    .start("noop")
The "noop" path is just a dummy: nothing is actually written there. Starting the query kicks off the stream, and each micro-batch is handed to the batch function, which does the real write over JDBC,
with your batch function being something like:
def batch_write_jdbc(df, batchId):
    # apply any per-batch transformations you need here, e.g. df = df.select(...)
    df.write.jdbc(jdbc_url,
                  table=schema_name + "." + table_name,
                  mode="append",
                  properties=connection_properties)
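The batch function assumes that jdbc_url, connection_properties, schema_name and table_name are already defined. As a rough sketch, for a Synapse dedicated SQL pool they could look something like the following (the server, database, credentials and table names below are placeholders, not values from the original post):

# Placeholder connection details -- substitute your own Synapse SQL endpoint and credentials
jdbc_url = (
    "jdbc:sqlserver://<your-workspace>.sql.azuresynapse.net:1433;"
    "database=<your-database>;encrypt=true;loginTimeout=30"
)
connection_properties = {
    "user": "<sql-user>",
    "password": "<sql-password>",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}
schema_name = "dbo"
table_name = "my_target_table"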

