Using LibreOffice in Databricks
07-27-2023 12:22 AM
Hi Community,
I'm using Databricks E2 and need to convert pptx files to PDF files.
This can be done in either a Python or an R notebook using #LibreOffice.
To achieve this I'd have to download LibreOffice, but I'm not sure how to do that. Would I have to install it on the cluster I'm using? If so: I tried uploading the LibreOffice jar file from the Compute tab, and I can see it under the /dbfs/FileStore/jars directory, but when I try to use the soffice command in a %sh cell, it is not recognized.
Could someone guide me on how to achieve this?
- Labels: Delta Lake, Spark, Workflows
07-27-2023 02:59 AM
I suppose by LibreOffice you mean the SDK, without the frontend?
You will have to install the jar as a library on the compute cluster.
From then on, you can use its classes in your code.
If you cannot run the jar from the command line, it might be because it is not an executable.
Besides that, Databricks might not be the right tool for this scenario.
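For what it's worth, here is a minimal Python sketch of the command-line route the question describes, which is a different approach from loading the jar as a library. It assumes the full LibreOffice package (not just a jar) has already been installed on the driver node so that `soffice` is on the PATH, for example through the OS package manager in a %sh cell or a cluster init script; whether that is permitted depends on your cluster setup. The function names and the file paths in the comments are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src, out_dir, soffice="soffice"):
    """Return the argument list for a headless LibreOffice PDF conversion."""
    return [
        soffice,
        "--headless",              # run without a GUI
        "--convert-to", "pdf",     # target format
        "--outdir", str(out_dir),  # directory that receives the PDF
        str(src),
    ]


def convert_pptx_to_pdf(src, out_dir, soffice="soffice"):
    """Convert one .pptx file to PDF and return the expected output path.

    Assumption: LibreOffice is installed on the node running this code.
    """
    # Fail early with a clear message if the binary is missing,
    # which is the situation described in the question.
    if shutil.which(soffice) is None:
        raise FileNotFoundError(
            f"{soffice!r} was not found on PATH; install LibreOffice first"
        )
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(src, out_dir, soffice), check=True)
    # LibreOffice names the output after the input file's stem.
    return Path(out_dir) / (Path(src).stem + ".pdf")


# Hypothetical usage in a notebook cell (paths are placeholders):
# pdf_path = convert_pptx_to_pdf("/dbfs/FileStore/slides/deck.pptx",
#                                "/dbfs/FileStore/pdf")
```

The up-front `shutil.which` check is there because a missing binary is exactly the symptom reported above: uploading a jar does not put an `soffice` executable on the PATH.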

