- 10928 Views
- 7 replies
- 13 kudos
The data looks like this:
pageId]|[page]|[Position]|[sysId]|[carId 0005]|[bmw]|[south]|[AD6]|[OP4
There are at least 50 columns and millions of rows.
I tried the code below to read it:
dff = sqlContext.read.format("com.databricks.spark.csv").option...
Latest Reply
You might also try the options below.
1) Use a different file format: try a format that supports multi-character delimiters, such as text or JSON.
2) Use a custom Row class: you can write a custom Row class to parse the multi-...
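A minimal sketch of option 2 above: read each row as plain text and split it on the literal `]|[` token yourself. The sample line below is taken from the question; the core of the approach is escaping the regex metacharacters so the whole delimiter matches literally. (In Spark you would apply the same regex with `pyspark.sql.functions.split` on a DataFrame read with `spark.read.text`.)

```python
import re

# one sample row in the "]|[" -delimited format from the question
line = "0005]|[bmw]|[south]|[AD6]|[OP4"

# ], | and [ are regex metacharacters, so escape them to match the
# three-character delimiter literally
fields = re.split(r"\]\|\[", line)
print(fields)  # ['0005', 'bmw', 'south', 'AD6', 'OP4']
```

Note that newer Spark releases (3.0+) also accept a multi-character `sep` option in the built-in CSV reader, which may remove the need for custom parsing entirely.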
- 1137 Views
- 2 replies
- 0 kudos
What are some guidelines for migrating to DBR 7/Spark 3?
Latest Reply
Please refer to the references below for switching to DBR 7.x. We have extended our DBR 6.4 support until December 2021.
DBR 6.4 extended support release notes: https://docs.databricks.com/release-notes/runtime/6.4x.html
Migration guide to DBR 7.x: htt...
- 15799 Views
- 1 replies
- 0 kudos
I have a table in HBase with 1 billion records. I want to filter the records on a certain condition (by date).
For example:
Dataframe.filter(col(date) === todayDate)
Will the filter be applied only after all records from the table have been loaded into me...
Latest Reply
Hello @senthil kumar. To pass external values to the filter (or where) transformations you can use the "lit" function in the following way: Dataframe.filter(col(date) === lit(todayDate)). Don't know if that helps. Be careful with the schema inferred by th...