Data Engineering

Delta Table to Spark Streaming to Synapse Table in Azure Databricks

User16826994223
Honored Contributor III

Is there a way to keep my Synapse database always in sync with the latest data from a Delta table? I believe my Synapse database doesn't support being used as a streaming sink. Is there a workaround?

1 REPLY

User16826994223
Honored Contributor III

You could try to keep the data in sync by appending each new micro-batch DataFrame inside a foreachBatch on your write stream. This method allows arbitrary ways of writing the data, so you can connect to the data warehouse with JDBC if necessary. The stream can be set up something like this:

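# Read the Delta table as a stream
df = spark.readStream \
          .format("delta") \
          .load(input_path)

# foreachBatch defines the sink, so no output format is set here;
# each micro-batch is passed to batch_write_jdbc
df_write = df.writeStream \
             .foreachBatch(batch_write_jdbc) \
             .option("checkpointLocation", checkpoint) \
             .start("noop")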

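The "noop" passed to start() is a dummy target: the stream itself does not write anything there, but starting it kicks off the streaming process, which calls the batch function for every micro-batch, and that function does the actual write using JDBC.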
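
One thing to keep in mind with this approach: by default foreachBatch gives at-least-once guarantees, so after a restart the same micro-batch can be handed to the batch function again, and a plain append would then duplicate rows. Below is a minimal sketch of using the batch id to skip replays. The names skip_replayed_batches, load_last_batch_id and save_last_batch_id are hypothetical helpers for illustration, not part of Spark, and where the last batch id is stored (for example a small control table in Synapse) is an assumption left to you.

```python
from typing import Any, Callable


def skip_replayed_batches(
    write_batch: Callable[[Any, int], None],
    load_last_batch_id: Callable[[], int],
    save_last_batch_id: Callable[[int], None],
) -> Callable[[Any, int], None]:
    """Wrap a foreachBatch function so that already written batch ids are skipped.

    load_last_batch_id / save_last_batch_id are hypothetical callbacks that
    read and persist the id of the last batch that was written successfully.
    load_last_batch_id should return -1 when nothing has been written yet.
    """

    def wrapped(df: Any, batch_id: int) -> None:
        if batch_id <= load_last_batch_id():
            # This batch was written before the restart, so do nothing.
            return
        write_batch(df, batch_id)
        # Record the id only after the write succeeded.
        save_last_batch_id(batch_id)

    return wrapped
```

It could then be passed to the stream as .foreachBatch(skip_replayed_batches(batch_write_jdbc, load_fn, save_fn)), with load_fn and save_fn being your own functions. Note that this only narrows the window for duplicates: unless the data and the batch id are committed in the same transaction, a failure between the write and the save can still replay one batch.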

The batch function, which needs to be defined before the stream is started, can be something like:

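def batch_write_jdbc(df, batchId):
    # Apply any per-batch transformation here (placeholder):
    # df = df.<any transformation>
    # Append the micro-batch to the Synapse table over JDBC
    df.write.jdbc(jdbc_url, table=schema_name + "." + table_name, mode="append", properties=connection_properties)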
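
The batch function above relies on jdbc_url and connection_properties being defined already. As a rough sketch, they could be built like this for a Synapse SQL endpoint, which is reached through the SQL Server JDBC driver. The helper name synapse_jdbc_settings and all the values in the usage lines are placeholders I made up, not something from your setup, and the URL options (encrypt, loginTimeout) are just one reasonable choice.

```python
from typing import Dict, Tuple


def synapse_jdbc_settings(
    host: str, database: str, user: str, password: str, port: int = 1433
) -> Tuple[str, Dict[str, str]]:
    """Build the JDBC URL and properties dict expected by df.write.jdbc().

    Hypothetical helper for illustration; host is the SQL endpoint of the
    Synapse workspace.
    """
    jdbc_url = (
        f"jdbc:sqlserver://{host}:{port};"
        f"database={database};encrypt=true;loginTimeout=30"
    )
    connection_properties = {
        "user": user,
        "password": password,
        # Driver class of the Microsoft SQL Server JDBC driver
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }
    return jdbc_url, connection_properties


# Placeholder values; in practice the password should come from a secret
# store rather than being written in the notebook.
jdbc_url, connection_properties = synapse_jdbc_settings(
    host="myworkspace.sql.azuresynapse.net",
    database="mydb",
    user="loader",
    password="<password>",
)
```

schema_name and table_name in the batch function are then just plain strings naming the target table.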
