Re: Want to load a high volume of CSV rows in the ...

daniel_sahal · ‎06-05-2023

@Michael Popp

In my opinion, the best way would be to split the file into several smaller files by some column (you need to find the one that fits best, ideally one that spreads the rows evenly), ingest them using Auto Loader with trigger=AvailableNow (batch-style processing), and write to a target table partitioned by the same column the files were split on.

This lets you achieve both goals: parallelism and avoiding data skew.
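
A rough sketch of the two steps, in case it helps. Everything named here is my own assumption and not from your setup: the helper names `split_csv_by_column` and `ingest_with_autoloader`, the paths, the table name, and the idea that the split column has few distinct, filename-safe values. The first function is plain Python and runs anywhere; the second only works on Databricks, where `spark` is the notebook's SparkSession and the `cloudFiles` source is available.

```python
import csv
import os
from collections import defaultdict


def split_csv_by_column(src_path, out_dir, column):
    """Split one CSV into one file per distinct value of `column`.

    Assumes the column has few distinct values that are safe to use
    as file names. Returns a dict: value -> number of rows written.
    """
    os.makedirs(out_dir, exist_ok=True)
    counts = defaultdict(int)
    handles = {}  # value -> (file handle, csv writer)
    try:
        with open(src_path, newline="") as src:
            reader = csv.DictReader(src)
            for row in reader:
                value = row[column]
                if value not in handles:
                    path = os.path.join(out_dir, f"{value}.csv")
                    fh = open(path, "w", newline="")
                    writer = csv.DictWriter(fh, fieldnames=reader.fieldnames)
                    writer.writeheader()
                    handles[value] = (fh, writer)
                handles[value][1].writerow(row)
                counts[value] += 1
    finally:
        # Close every output file, even if reading failed part-way.
        for fh, _ in handles.values():
            fh.close()
    return dict(counts)


def ingest_with_autoloader(spark, source_dir, target_table,
                           checkpoint_dir, partition_col):
    """Databricks only: read the split files with Auto Loader and write
    them to a table partitioned by the same column used for the split."""
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("cloudFiles.schemaLocation", checkpoint_dir)
        .option("header", "true")
        .load(source_dir)
        .writeStream
        .option("checkpointLocation", checkpoint_dir)
        .partitionBy(partition_col)
        .trigger(availableNow=True)  # process what is there, then stop
        .toTable(target_table)
    )
```

The per-value row counts returned by the split are a cheap way to check for skew before ingesting: if one value dominates, a different column is probably a better choice.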
