cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

lambarc
by New Contributor II
  • 10343 Views
  • 7 replies
  • 13 kudos

How to read file in pyspark with “]|[” delimiter

The data looks like this: pageId]|[page]|[Position]|[sysId]|[carId 0005]|[bmw]|[south]|[AD6]|[OP4 There are atleast 50 columns and millions of rows. I did try to use below code to read: dff = sqlContext.read.format("com.databricks.spark.csv").option...

  • 10343 Views
  • 7 replies
  • 13 kudos
Latest Reply
rohit199912
New Contributor II
  • 13 kudos

you might also try the blow option.1). Use a different file format: You can try using a different file format that supports multi-character delimiters, such as text JSON.2). Use a custom Row class: You can write a custom Row class to parse the multi-...

  • 13 kudos
6 More Replies
User16789201666
by Contributor II
  • 1059 Views
  • 2 replies
  • 0 kudos

What are some guidelines for migrating to DBR 7/Spark 3?

What are some guidelines for migrating to DBR 7/Spark 3?

  • 1059 Views
  • 2 replies
  • 0 kudos
Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

Please refer to the below reference for switching to DBR 7.xWe have extended our DBR 6.4 support until December 2021, DBR 6.4 extended support - Release notes: https://docs.databricks.com/release-notes/runtime/6.4x.htmlMigration guide to DBR 7.x: htt...

  • 0 kudos
1 More Replies
senthilkumar
by New Contributor
  • 15144 Views
  • 1 replies
  • 0 kudos

How filter condition working in spark dataframe?

I have a table in hbase with 1 billions records.I want to filter the records based on certain condition (by date). For example: Dataframe.filter(col(date) === todayDate) Filter will be applied after all records from the table will be loaded into me...

  • 15144 Views
  • 1 replies
  • 0 kudos
Latest Reply
muk1
New Contributor II
  • 0 kudos

Hello @senthil kumar​ To pass external values to the filter (or where) transformations you can use the "lit" function in the following way:Dataframe.filter(col(date) == lit(todayDate))don´t know if that helps. Be careful with the schema infered by th...

  • 0 kudos
Labels