Delta Table to Spark Streaming to Synapse Table in azure databricks

User16826994223 — Fri, 25 Jun 2021 16:15:10 GMT

Is there a way to keep my Synapse database always in sync with the latest data from a Delta table? I believe my Synapse database doesn't support a stream as a sink. Is there a workaround?

Re: Delta Table to Spark Streaming to Synapse Table in azure databricks

User16826994223 — Fri, 25 Jun 2021 16:17:48 GMT

You could try to keep the data in sync by appending each new micro-batch DataFrame in a foreachBatch on your write stream. This method allows arbitrary ways of writing data, so you can connect to the data warehouse with JDBC if necessary. The stream would look something like:

df = spark.readStream \
          .format("delta") \
          .load(input_path)

# foreachBatch takes the place of the sink, so no output format or path is needed
df_write = df.writeStream \
             .foreachBatch(batch_write_jdbc) \
             .option("checkpointLocation", checkpoint_path) \
             .start()

The stream itself does not write anything. Starting it only kicks off the streaming query, which calls the batch function for each micro-batch, and that function does the actual write using JDBC.

The batch function would be something like:

def batch_write_jdbc(df, batchId):
    # Apply any transformation you need to the micro-batch here
    # before writing, e.g. df = df.select(...)
    df.write.jdbc(jdbc_url, table=schema_name + "." + table_name, mode="append", properties=connection_properties)
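
The batch function above uses jdbc_url and connection_properties without defining them. Here is a minimal sketch of how they might be set up, assuming the Microsoft SQL Server JDBC driver is available on the cluster. The helper names (build_jdbc_url, build_connection_properties, make_batch_writer) and all the placeholder values are made up for illustration and are not part of any library, so adjust them to your environment:

```python
# Hypothetical helpers: the names and placeholder values below are
# illustrative only and do not come from the original post or any library.
SQLSERVER_DRIVER = "com.microsoft.sqlserver.jdbc.SQLServerDriver"


def build_jdbc_url(server, database, port=1433):
    """Build a SQL Server style JDBC URL for the Synapse SQL endpoint."""
    return f"jdbc:sqlserver://{server}:{port};database={database};encrypt=true"


def build_connection_properties(user, password):
    """Return a properties dict in the shape df.write.jdbc expects."""
    return {"user": user, "password": password, "driver": SQLSERVER_DRIVER}


def make_batch_writer(jdbc_url, table, properties):
    """Return a foreachBatch function that appends each micro-batch over JDBC."""
    def batch_write_jdbc(df, batchId):
        # df is the micro-batch DataFrame handed over by foreachBatch
        df.write.jdbc(jdbc_url, table=table, mode="append", properties=properties)
    return batch_write_jdbc


# Placeholder values: replace with your own server, database and credentials
jdbc_url = build_jdbc_url("<server-hostname>", "<database>")
connection_properties = build_connection_properties("<user>", "<password>")
batch_write_jdbc = make_batch_writer(jdbc_url, "<schema>.<table>", connection_properties)
```

Building the batch function through a factory keeps the connection settings out of global variables, which makes it easier to try out with a stand-in object before attaching it to the stream. In practice you would probably read the password from a secret scope rather than hard-coding it.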
