Data Engineering

Forum Posts

Rani
by New Contributor
  • 7199 Views
  • 2 replies
  • 0 kudos

Divide a dataframe into multiple smaller dataframes based on values in multiple columns in Scala

I have to divide a dataframe into multiple smaller dataframes based on the values in columns such as gender and state. The end goal is to pick random samples from each dataframe. I am trying to implement a sample as explained below. I am quite new to th...
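
One way to get per-group random samples without materializing a separate dataframe for each gender/state combination is stratified sampling with `DataFrameStatFunctions.sampleBy`. The sketch below is only an illustration, not the poster's code: the object name `StratifiedSampleSketch`, the helper `sampleByStrata`, the `stratum` column, the `"|"` separator, and the sample data are all hypothetical, and it assumes the number of distinct combinations is small enough to collect to the driver.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws}

object StratifiedSampleSketch {

  // Hypothetical helper: samples roughly `fraction` of the rows from every
  // distinct combination of values in `cols`.
  def sampleByStrata(df: DataFrame, cols: Seq[String], fraction: Double, seed: Long): DataFrame = {
    // Build one key per row from the grouping columns.
    // Assumption: the values never contain the "|" separator themselves.
    val withKey = df.withColumn("stratum", concat_ws("|", cols.map(col): _*))

    // Collect the distinct keys to the driver.
    // Assumption: there are few distinct combinations.
    val keys = withKey.select("stratum").distinct().collect().map(_.getString(0))

    // Same sampling fraction for every stratum; a different fraction per key also works.
    val fractions: Map[String, Double] = keys.map(k => k -> fraction).toMap

    // sampleBy keeps each row independently with its stratum's probability,
    // so the sample sizes are approximate rather than exact.
    withKey.stat.sampleBy("stratum", fractions, seed).drop("stratum")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("stratified-sample-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up data standing in for the poster's dataframe.
    val df = Seq(
      ("F", "CA", 1), ("F", "CA", 2), ("M", "CA", 3),
      ("M", "NY", 4), ("F", "NY", 5), ("M", "NY", 6)
    ).toDF("gender", "state", "id")

    sampleByStrata(df, Seq("gender", "state"), fraction = 0.5, seed = 42L).show()

    spark.stop()
  }
}
```

If each group really must be a separate dataframe, filtering on one collected key at a time is the usual alternative, at the cost of one Spark job per group.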

Latest Reply
subham0611
New Contributor II
  • 0 kudos

@raela I also have a similar use case. I am writing data to different Databricks tables based on a column value, but I am getting an insufficient disk space error and the driver is getting killed. I suspect the df.select(colName).distinct().collect() step is taki...

1 More Replies
Kaniz
by Community Manager
  • 626 Views
  • 1 reply
  • 1 kudos
Latest Reply
saipujari_spark
Valued Contributor
  • 1 kudos

Yes, we can use the built-in concat() and concat_ws() functions.

concat usage:
> SELECT concat('Spark', 'SQL');
SparkSQL

concat_ws usage (concatenates with a separator):
> SELECT concat_ws('-', 'Spark', 'SQL');
Spark-SQL

Reference: https://spark.apache.org/do...
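
The same two functions are also available in the Scala DataFrame API under `org.apache.spark.sql.functions`. The following is a minimal sketch rather than part of the original reply; the object name `ConcatSketch`, the helper `concatColumns`, and the column names `a`, `b`, `joined`, and `joined_with_sep` are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, concat_ws}

object ConcatSketch {

  // Hypothetical helper: expects string columns "a" and "b" and returns
  // both concatenation variants side by side.
  def concatColumns(df: DataFrame): DataFrame =
    df.select(
      // concat joins the values with nothing in between.
      concat(col("a"), col("b")).as("joined"),
      // concat_ws takes the separator as its first argument.
      concat_ws("-", col("a"), col("b")).as("joined_with_sep")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("concat-sketch")
      .getOrCreate()
    import spark.implicits._

    // One made-up row mirroring the SQL example in the reply.
    val df = Seq(("Spark", "SQL")).toDF("a", "b")

    concatColumns(df).show()

    spark.stop()
  }
}
```

One behavioral difference worth knowing: concat returns null if any input is null, whereas concat_ws skips null inputs.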
