Hi there,
I’m trying to join a small table (a few million records) with a much larger table (around 1 TB in size, containing a few billion records).
The small table isn’t quite small enough for a broadcast join. Additionally, our join clause involves more than four columns. I tried enabling Liquid Clustering on the large table, but it supports at most four clustering columns. I experimented with different four-column subsets of the join columns as clustering keys, but none of them reduced the join time.
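To make the setup concrete, the four-column experiments can be sketched roughly as follows. The table names, the column names (c1 to c5), and the helper functions are placeholders for illustration only, not the real schema. The sketch only builds the SQL strings; on a Databricks cluster each one would be passed to spark.sql(...).

```python
from itertools import combinations

# Hypothetical names: the real tables and join columns are not shown in this post.
LARGE_TABLE = "large_tbl"
SMALL_TABLE = "small_tbl"
JOIN_COLS = ["c1", "c2", "c3", "c4", "c5"]  # stands in for "more than four" join columns
MAX_CLUSTER_KEYS = 4  # the Liquid Clustering limit mentioned above


def cluster_by_statement(table, cols):
    """Build the ALTER TABLE statement that sets the clustering keys."""
    if len(cols) > MAX_CLUSTER_KEYS:
        raise ValueError(f"at most {MAX_CLUSTER_KEYS} clustering keys allowed")
    return f"ALTER TABLE {table} CLUSTER BY ({', '.join(cols)})"


def join_query(large, small, cols):
    """Build the equi-join on every join column."""
    on_clause = " AND ".join(f"l.{c} = s.{c}" for c in cols)
    return f"SELECT * FROM {large} l JOIN {small} s ON {on_clause}"


# Every four-column subset of the join columns that could be tried as keys.
candidates = [list(c) for c in combinations(JOIN_COLS, MAX_CLUSTER_KEYS)]

for keys in candidates:
    print(cluster_by_statement(LARGE_TABLE, keys))
    # On the cluster: run the statement, then OPTIMIZE the large table,
    # then time the query built by join_query(LARGE_TABLE, SMALL_TABLE, JOIN_COLS).
```

Whichever subset is chosen, the join itself still compares all of the join columns, so at least one join column is never a clustering key.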
Do you have any recommendations for optimizing a query on a table with Liquid Clustering when the join criteria involve more than four columns?