dave_d
New Contributor II

Interesting, thanks for the response! Why is it here that the Columnar to Row node happens after all of the query logic, and right before the query results stage? I would assume that if it's needed for a particular operation that only supports row-based processing, then it would happen earlier in the plan.

Also, do you know if there a list somewhere in the Databricks or Spark SQL docs that specifies which types of Spark SQL expressions/operations support columnar execution? Would love to know if I can rewrite this query somehow to not require that transition, given how expensive it appears to be here.