@ranged_coop In addition to my previous message: contrary to what you said, checkpointing is not a Databricks behavior. It is part of open source Spark Structured Streaming.
@dmytro, autoscaling is managed by Databricks and its logic is mostly automatic. But if you're planning to run Structured Streaming in production, I suggest you go for a fixed number of workers and limit your streaming query's input rate, or create a...
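To make the input rate limit concrete, here is a minimal sketch, assuming the stream reads from Kafka. `maxOffsetsPerTrigger` is the Kafka source option that caps how many offsets a micro-batch consumes (file-based sources have `maxFilesPerTrigger` instead). The helper name, broker address, topic, and the limit of 50000 are hypothetical values for illustration only.

```python
def rate_limited_kafka_options(bootstrap_servers, topic, max_offsets_per_trigger):
    """Build Kafka source options that cap the records read per micro-batch.

    Hypothetical helper for illustration; the option keys themselves are
    standard Structured Streaming Kafka source options.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Upper bound on offsets consumed per trigger, across all partitions.
        "maxOffsetsPerTrigger": str(max_offsets_per_trigger),
    }


# Placeholder values, not taken from the thread.
options = rate_limited_kafka_options("broker-1:9092", "events", 50000)

# On a cluster with an active SparkSession named `spark`, the options
# would be applied like this:
# stream = spark.readStream.format("kafka").options(**options).load()
```

With a fixed number of workers, a cap like this keeps each micro-batch at a predictable size, so you can tune the limit against the batch duration you observe.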
@dmytro Yes, it's possible to monitor the consumer lag through the streaming query metrics. Every cluster that runs a Spark Structured Streaming query logs the metrics for each streaming batch in the driver logs and the Spark UI. More details at Moni...
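As a rough sketch of reading those metrics programmatically: in PySpark, `query.lastProgress` exposes the latest batch's progress as a dict-like structure. The helper below assumes a Kafka source on a recent Spark 3.x version whose progress entry reports both `endOffset` and `latestOffset` as topic → partition → offset maps. The function name and the sample numbers are made up for illustration.

```python
def kafka_consumer_lag(progress):
    """Return {(topic, partition): lag} from one streaming progress dict.

    Lag is the latest offset available in Kafka minus the offset the
    batch ended at. Sources without dict-shaped offsets are skipped.
    """
    lag = {}
    for source in progress.get("sources", []):
        end = source.get("endOffset") or {}
        latest = source.get("latestOffset") or {}
        if not isinstance(end, dict) or not isinstance(latest, dict):
            continue
        for topic, partitions in latest.items():
            for partition, latest_offset in partitions.items():
                # A partition missing from endOffset is treated as unread.
                processed = end.get(topic, {}).get(partition, 0)
                lag[(topic, partition)] = latest_offset - processed
    return lag


# Illustrative progress payload, not real output from any cluster.
sample_progress = {
    "batchId": 12,
    "sources": [
        {
            "description": "KafkaV2[Subscribe[events]]",
            "endOffset": {"events": {"0": 1500, "1": 1480}},
            "latestOffset": {"events": {"0": 1600, "1": 1480}},
        }
    ],
}

lag = kafka_consumer_lag(sample_progress)
total_lag = sum(lag.values())

# With a running query, and once the first batch has completed:
# lag = kafka_consumer_lag(query.lastProgress)
```

Note that `lastProgress` is empty until the first batch finishes, so check for that before calling the helper on a live query.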
@ranged_coop Regarding your questions:
Is there any setting that needs to be enabled to fix this?
There is no setting to change this behavior, as it is a design decision and not an issue. It looks like you're referring to checkpointing. These are the docs...
Hi @camilo_s,
Spark SQL is the SQL API for Spark applications, while Databricks SQL is a product that follows data warehouse principles. You can anticipate performance differences mainly due to the fact that Databricks SQL compute is based on SQ...