sean_owen
Databricks Employee

Yeah, it's more of a design choice. Rather than have every implementation take its own column(s) params, this is handled once in VectorAssembler for all of them. One way or another, most implementations need a single vector of inputs anyway. VectorAssembler can also do some optimizations, like using sparse vectors where applicable.
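To make the idea concrete, here's a rough plain-Python sketch of the pattern, not Spark's actual code. The names `assemble`, `compress` and `ToyModel` are made up for illustration, and the "go sparse when fewer than half the entries are non-zero" rule is an assumption, not necessarily the threshold Spark uses.

```python
from typing import Dict, List, Sequence, Tuple, Union

# Toy representations (hypothetical, not Spark's classes):
# a dense vector is a list of floats, a sparse one is (size, indices, values).
Dense = List[float]
Sparse = Tuple[int, List[int], List[float]]
Vector = Union[Dense, Sparse]


def compress(values: Dense) -> Vector:
    """Pick a sparse layout when most entries are zero (assumed rule)."""
    nonzero = [(i, v) for i, v in enumerate(values) if v != 0.0]
    if 2 * len(nonzero) < len(values):
        return (len(values), [i for i, _ in nonzero], [v for _, v in nonzero])
    return values


def assemble(row: Dict[str, object], input_cols: Sequence[str]) -> Vector:
    """Concatenate the named columns of one row into a single vector."""
    values: Dense = []
    for col in input_cols:
        cell = row[col]
        if isinstance(cell, (list, tuple)):
            # The column already holds several numbers: append them all.
            values.extend(float(x) for x in cell)
        else:
            values.append(float(cell))
    return compress(values)


def to_dense(vector: Vector) -> Dense:
    """Expand either layout back to a plain list of floats."""
    if isinstance(vector, tuple):
        size, indices, vals = vector
        dense = [0.0] * size
        for i, v in zip(indices, vals):
            dense[i] = v
        return dense
    return list(vector)


class ToyModel:
    """Stand-in for an estimator: it takes one vector, never column names."""

    def predict(self, features: Vector) -> float:
        return sum(to_dense(features))


row = {"age": 30, "income": 5.0, "flags": [0, 0, 0, 0, 0, 1]}
features = assemble(row, ["age", "income", "flags"])
prediction = ToyModel().predict(features)
```

The column-picking logic lives in one place (`assemble`), so the model only needs to understand "a vector of features", and the dense-vs-sparse decision is made once for everything downstream.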