Data Engineering

Using LibreOffice in Databricks

dzm
New Contributor

Hi Community, 

I'm using Databricks E2 and need to convert PPTX files to PDF files.

This can be done in either a Python or an R notebook using #LibreOffice.

To achieve this I'd have to download LibreOffice, and I'm not sure how to do that. Would I have to install it on the cluster I'm using? If so: I tried uploading a LibreOffice jar file from the Compute tab, and I can see it under the /dbfs/FileStore/jars directory, but when I try to use the soffice command in a %sh cell, the command is not recognized.

Could someone guide me on how to achieve this?

1 REPLY

-werners-
Esteemed Contributor III

I suppose by LibreOffice you mean the SDK, without the frontend?

You will have to install the jar as a library on the compute cluster.
From that moment on, you can use the classes in your code.
If you cannot run the jar from a command line, it might be because it is not an executable.

Besides that: Databricks might not be the correct tool for such a scenario.
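
For reference, a minimal Python sketch of the conversion step itself. It takes a different route from the jar approach above: the soffice command ships with the LibreOffice application, not with a jar, so this assumes the application has been installed on the node where the notebook code runs (for example via `apt-get install -y libreoffice` in a %sh cell or a cluster init script; that install step is an assumption and is not shown in this thread). The function names `build_convert_command` and `convert_to_pdf` are hypothetical helpers; the `--headless`, `--convert-to` and `--outdir` options are standard soffice command-line options.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src, outdir, soffice="soffice"):
    """Return the argv list for a headless conversion of `src` to PDF."""
    return [
        soffice,
        "--headless",            # run without a GUI
        "--convert-to", "pdf",   # target format
        "--outdir", str(outdir), # directory that receives the PDF
        str(src),
    ]


def convert_to_pdf(src, outdir, soffice="soffice", timeout=300):
    """Convert one file to PDF and return the path where the PDF is expected.

    Raises FileNotFoundError if the soffice executable is not on PATH,
    which is the situation described in the question.
    """
    if shutil.which(soffice) is None:
        raise FileNotFoundError(
            f"{soffice!r} is not on PATH; install LibreOffice on this node first"
        )
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if soffice exits with an error.
    subprocess.run(
        build_convert_command(src, outdir, soffice),
        check=True,
        timeout=timeout,
    )
    # soffice names the output after the input file's stem.
    return outdir / (Path(src).stem + ".pdf")
```

A call might look like `convert_to_pdf("/dbfs/FileStore/slides/deck.pptx", "/dbfs/FileStore/pdf")`, where both paths are placeholders. Note that a %sh install only affects the driver node, so this sketch assumes the conversion runs on the driver rather than inside a distributed Spark job.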