Not able to parse .doc files using Scala in a Databricks notebook
11-21-2022 10:41 PM
I am able to parse .doc files in Java with the help of the Apache POI libraries. When I convert the Java code to Scala, I expect it to work with the same Java libraries, but it fails with the error below even after I add the respective Java library.
I would appreciate any help in unblocking me.
My requirement is simply to parse .doc files, so any other alternative is fine with me as well.
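For reference, here is a minimal Scala sketch of the kind of POI call involved. It is an illustration under stated assumptions, not the exact code from the notebook. It assumes both the `poi` and `poi-scratchpad` artifacts are attached to the cluster (the HWPF classes for legacy .doc files are packaged in `poi-scratchpad`). The function name `extractDocText` and the file path in the usage note are hypothetical.

```scala
import java.io.FileInputStream
import org.apache.poi.hwpf.HWPFDocument
import org.apache.poi.hwpf.extractor.WordExtractor

// Hypothetical helper: reads a legacy .doc (Word 97-2003) file and
// returns its plain text. Requires poi and poi-scratchpad on the classpath.
def extractDocText(path: String): String = {
  val in = new FileInputStream(path)
  try {
    // HWPFDocument parses the binary .doc format from the stream.
    val doc = new HWPFDocument(in)
    val extractor = new WordExtractor(doc)
    // Close the extractor even if text extraction throws.
    try extractor.getText
    finally extractor.close()
  } finally {
    in.close()
  }
}
```

In a notebook this might be called as `extractDocText("/dbfs/tmp/sample.doc")`, where the path is a placeholder for a file reachable through the local `/dbfs` mount on the driver.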
12-03-2022 12:57 AM
Hi @Ramesh Bathini
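In PySpark, there is the docx module (from the python-docx package). I found it to work perfectly fine. Can you try using that? Note that it reads .docx files, so a legacy .doc file would need to be converted first.
The documentation can be found online.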
Cheers...

