Hi @jeremy98, the collect() operation brings the data to the driver, and yes, it can cause the memory issues that you are seeing; if it is done enough times, it can also cause the cluster to hang or crash. You may confirm these instances from the cluster even...
Hi @bsr, there was some internal discussion on this, and I learned that these DEBUG thread-dump lines from the ThreadMonitor started leaking to stderr/job output due to a Python logger misconfiguration introduced in the 17.3.3 branch. Th...
Hi @deane,
I worked on a similar issue some months ago, and the limit of 1024 is not configurable (there is no explicit way to increase the filter count), so my suggestion back then was to use a two-step process, such as filtering on half of the filters first and...
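To illustrate the batching idea, here is a minimal sketch in plain Python, not the exact approach from the original thread (the rest of that reply is cut off here). It assumes the filters are simple equality values on one column, and that the per-batch results can just be concatenated. The names `chunk_filters`, `filter_in_batches` and `MAX_FILTERS` are hypothetical and only used for illustration.

```python
from typing import Dict, List, Sequence

MAX_FILTERS = 1024  # limit quoted in the reply; treated here as a fixed constant


def chunk_filters(values: Sequence, limit: int = MAX_FILTERS) -> List[list]:
    """Split filter values into batches of at most `limit`, dropping duplicates."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    unique = list(dict.fromkeys(values))  # de-duplicate, keep first-seen order
    return [unique[i:i + limit] for i in range(0, len(unique), limit)]


def filter_in_batches(rows: List[Dict], key: str, values: Sequence,
                      limit: int = MAX_FILTERS) -> List[Dict]:
    """Run one filter pass per batch and concatenate the matches."""
    matched: List[Dict] = []
    for batch in chunk_filters(values, limit):
        wanted = set(batch)  # each pass uses no more than `limit` filter values
        matched.extend(row for row in rows if row[key] in wanted)
    return matched
```

For example, 2000 filter values would be split into two passes of 1024 and 976 values. Because the batches are disjoint, no row is matched twice, but the combined result follows batch order rather than the original row order.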
Hi @CHorton, the Databricks SQL engine does not support positional (?) parameters inside SQL UDF calls.
When Spark SQL parses GetCustomerData(?), the parameter is unresolved at analysis time, so you get [UNBOUND_SQL_PARAMETER]. This is not an ODBC bu...
Hi @Mukul
Databricks’ current direction is to use Unity Catalog as an open Iceberg catalog. UC exposes tables via the Iceberg REST Catalog API, so external engines (Spark, Flink, Trino, Snowflake, PyIceberg, etc.) can read and write UC-managed Icebe...