Using LibreOffice in Databricks

dzm
Databricks Partner

Hi Community, 

I'm using Databricks E2, and need to convert pptx files to pdf files.

This can be done in either a Python or an R notebook using LibreOffice.

To do this I would need to install LibreOffice, but I'm not sure how. Does it have to be installed on the cluster I'm using? If so: I tried uploading a LibreOffice jar file from the Compute tab, and I can see it under the /dbfs/FileStore/jars directory, but when I run the soffice command in a %sh cell, the command is not recognized.

Could someone guide me on how to achieve this?

-werners-
Esteemed Contributor III

I suppose by LibreOffice you mean the SDK, without the frontend?

You will have to install the jar as a library on the compute cluster.
From that moment on, you can use the classes in your code.
If you cannot run the jar from a command line, it might be because it is not an executable.

Besides that: Databricks might not be the correct tool for such a scenario.
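
For reference, here is a minimal sketch of a different route from the jar library: calling the soffice command-line binary from a Python notebook. It assumes LibreOffice itself has already been installed on the cluster nodes (for example through a cluster init script or the OS package manager, which this thread does not confirm is possible in your workspace), so that soffice is on the PATH. Uploading a jar does not put soffice there, which would explain the "not recognized" error. The function names and the example paths are hypothetical; the --headless, --convert-to and --outdir flags are standard LibreOffice command-line options.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(pptx_path, out_dir, soffice="soffice"):
    """Build the soffice argument list for a headless pptx -> pdf conversion."""
    return [
        soffice,
        "--headless",          # run without a GUI
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(pptx_path),
    ]


def convert_pptx_to_pdf(pptx_path, out_dir, soffice="soffice"):
    """Convert one pptx file and return the expected path of the pdf.

    Assumes the LibreOffice binary is installed and reachable on the PATH.
    """
    if shutil.which(soffice) is None:
        raise FileNotFoundError(
            f"'{soffice}' was not found on the PATH; "
            "LibreOffice does not appear to be installed on this node"
        )
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if soffice exits with an error.
    subprocess.run(
        build_convert_command(pptx_path, out_dir, soffice),
        check=True,
        timeout=300,
    )
    # soffice writes <input stem>.pdf into the output directory.
    return Path(out_dir) / (Path(pptx_path).stem + ".pdf")
```

A call might look like convert_pptx_to_pdf("/dbfs/FileStore/decks/demo.pptx", "/dbfs/FileStore/pdfs"), where both paths are placeholders. Note that this runs on the driver node only, so it does not parallelize across the cluster.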