Hi! Databricks recently released documentation on using Unity Catalog with Structured Streaming: https://docs.databricks.com/structured-streaming/unity-catalog.html Per the requirements in that document, for both interactive notebooks and scheduled jobs, you m...
Hi! For low-latency queries, it helps to break this down into two parts: query serving latency and data freshness latency. Serving the data with DLT, you can probably get streams at about 1-second intervals, and once that's committed to Delta, it's immediat...
Hi! Generally speaking, it's good practice to avoid the collect() action unless you absolutely need it, because collect() is an action that retrieves all the elements of the RDD/DataFrame/Dataset from all nodes to the driver node. If the datase...
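
To make the point about collect() concrete, here's a minimal sketch of two common alternatives, assuming `df` is a PySpark DataFrame. The helper names `preview_rows` and `iter_rows` are made up for illustration; the only Spark calls used are `DataFrame.take(n)` and `DataFrame.toLocalIterator()`.

```python
from typing import Any, Iterator, List


def preview_rows(df: Any, n: int = 20) -> List[Any]:
    """Return at most n rows to the driver instead of the full dataset.

    Assumes df is a PySpark DataFrame; take(n) only brings n rows back.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return df.take(n)


def iter_rows(df: Any) -> Iterator[Any]:
    """Iterate over all rows without materializing them all on the driver.

    Assumes df is a PySpark DataFrame; toLocalIterator() fetches data
    partition by partition, so driver memory is bounded by roughly the
    largest partition rather than the whole dataset.
    """
    return iter(df.toLocalIterator())
```

In a notebook you'd call something like `preview_rows(df, 10)` to eyeball a few rows, or loop over `iter_rows(df)` when you really do need every row on the driver. If you only need a count or a sum, it's usually better to compute it with a DataFrame aggregation so that only the result comes back.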