JongKim
New Contributor III

I also had the same problem, and here's how to make it work using column type and varargs:

// make example dataframe import org.apache.spark.sql.DataFrame val df: DataFrame = sc.parallelize(Seq((1, 2, 3), (4, 5, 6), (7, 8, 9))).toDF("a", "b", "c")

// desired list of column names in string (making it possible programmatically) val column_names_str = Seq[String]("a", "b")

// construct list of column names in column type import org.apache.spark.sql.functions.col val column_names_col = column_names_str.map(name => col(name)) //val column_names_col = column_names_str.map(name => col(name).as(s"renamed_$name")) // rename if needed

// select specific columns from dataframe using varargs syntax * val df_new = df.select(column_names_col: *) df_new.show()

This should yield as expected:

+---+---+
|  a|  b|
+---+---+
|  1|  2|
|  4|  5|
|  7|  8|
+---+---+

View solution in original post