SaugatMukherjee
New Contributor III

Hi,

Streaming from Iceberg is possible in Databricks; there is no need to switch to Delta Lake. In my previous attempt, I used "load" when reading the source Iceberg table. One should use "table" instead, because "load" appears to expect a path rather than a table name.

This is the correct code to stream from an Iceberg table:

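from pyspark.sql.functions import current_timestamp

# Read the Iceberg source as a stream. Use .table() with the table name, not .load().
source_stream = (spark.readStream
    .format("iceberg")
    .table("engagement.`sandbox-client-feedback`.dummy_iceberg_source_stream")
    # Add a column recording when each row was processed.
    .withColumn("_ProcessedTime", current_timestamp())
)

# Append the stream to the Iceberg destination table.
query = (source_stream.writeStream
    .format("iceberg")
    .outputMode("append")
    # availableNow=True processes all data currently available, then stops.
    .trigger(availableNow=True)
    .option("checkpointLocation", "/tmp/checkpoint/testicebergdestination")
    .toTable("engagement.`sandbox-client-feedback`.dummy_iceberg_destination")
)
# Block until the query has finished.
query.awaitTermination()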
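
A side note on the table names: the schema name contains hyphens, which is why it is wrapped in backticks. If you build these names in code, something like the following might help. It is only a minimal sketch: quote_part and qualified_name are hypothetical helper names (not part of PySpark), and it assumes Spark SQL's rule that a backtick inside a quoted identifier is escaped by doubling it.

```python
import re

# Parts that start with a letter or underscore and contain only letters,
# digits, and underscores are left unquoted; everything else gets backticks.
_PLAIN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def quote_part(part: str) -> str:
    """Hypothetical helper: backtick-quote one identifier part if needed."""
    if _PLAIN.match(part):
        return part
    # Assumption: a literal backtick is escaped by doubling it.
    return "`" + part.replace("`", "``") + "`"

def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Hypothetical helper: join catalog, schema, and table into one name."""
    return ".".join(quote_part(p) for p in (catalog, schema, table))

print(qualified_name("engagement", "sandbox-client-feedback", "dummy_iceberg_source_stream"))
```

The resulting string can then be passed to .table() or .toTable() in place of the hard-coded name.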
