said differently/functionally, mapPartitions() returns a value and does not have side effects. foreachPartition() does not return a value, but (typically) does have side effects.
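to make the contrast concrete, here is a minimal plain-Python sketch of the two contracts. it is not the Spark API: `map_partitions`, `foreach_partition`, `double`, `write_partition`, and `sink` are hypothetical names, and a list of lists stands in for a partitioned dataset. note also that in Spark mapPartitions() is a lazy transformation and foreachPartition() is an action, whereas this sketch runs both eagerly.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical stand-in for a partitioned dataset: one inner list per partition.
Partitions = List[List[int]]


def map_partitions(
    parts: Partitions, fn: Callable[[Iterator[int]], Iterable[int]]
) -> Partitions:
    """Apply fn to each partition's iterator and return NEW partitions."""
    return [list(fn(iter(p))) for p in parts]


def foreach_partition(
    parts: Partitions, fn: Callable[[Iterator[int]], None]
) -> None:
    """Run fn on each partition's iterator purely for its side effects."""
    for p in parts:
        fn(iter(p))


def double(rows: Iterator[int]) -> Iterator[int]:
    """Pure per-partition function: yields transformed rows, touches nothing else."""
    for r in rows:
        yield r * 2


sink: List[int] = []  # stands in for an external system (database, file, ...)


def write_partition(rows: Iterator[int]) -> None:
    """Side-effecting per-partition function: pushes one value per partition to sink."""
    sink.append(sum(rows))


data: Partitions = [[1, 2], [3, 4, 5]]

doubled = map_partitions(data, double)            # a value comes back
result = foreach_partition(data, write_partition)  # nothing comes back
```

in this sketch `doubled` holds the transformed partitions and `data` is left unchanged, while `result` is `None` and the only observable effect of the second call is what ended up in `sink`.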
@schnee ha! nah, this is an unnecessarily verbose and complex way of doing a fairly common transformation. it would be nice to have a df.explodeArray() method. anyway, what type of exceptions are you seeing?
@samalexg: are you sure you're writing to the /etc/environment file as follows: otherwise, the env vars are only set for the process that is called to run the script. i assume you're doing this, but wanted to double check as this has been a common mi...
Sorted Data: If your data is sorted using either sort() or ORDER BY, these operations will be deterministic and return either the 1st element using first()/head() or the top-n using head(n)/take(n). show()/show(n) return Unit (void) and will print up to...