Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How can we save a data frame in Docx format using pyspark?

rammy
Contributor III

  

I am trying to save a DataFrame as a document, but the write fails with the error below:

java.lang.ClassNotFoundException: Failed to find data source: docx. Please find packages at http://spark.apache.org/third-party-projects.html

           #f_data is my dataframe with data
           f_data.write.format("docx").save("dbfs:/FileStore/test/test.csv")
           display(f_data)
 

Note that I can save files in CSV, text, and JSON formats. Is there any way to save a Docx file using PySpark?

2 REPLIES

Harun
Honored Contributor

Hi @Ramesh Bathini

Only the following file formats are supported out of the box:

  • text
  • csv
  • json
  • parquet
  • orc

Source Code for DataframeWriter:

https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWr...

jose_gonzalez
Databricks Employee

Hi,

You cannot do this from PySpark directly, but you can try using Pandas to save the data to Excel instead. There is no Docx writer.
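If a Docx file is really what you need, one possible workaround is to collect the rows to the driver and build the file in plain Python. The sketch below is only an illustration, not a Spark feature: `rows_to_docx` is a hypothetical helper name, it assumes the data is small enough to fit in driver memory, and it writes just the three package parts of a minimal WordprocessingML document containing one unstyled table. A dedicated library such as python-docx would likely be more convenient for real documents.

```python
import zipfile
from xml.sax.saxutils import escape

# Namespace for WordprocessingML elements (the "w:" prefix).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Package part that declares the content type of each file in the archive.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Package part that points at the main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)


def _cell(value):
    # One table cell holding a single paragraph; None becomes an empty cell.
    text = escape("" if value is None else str(value))
    return ('<w:tc><w:p><w:r><w:t xml:space="preserve">'
            f'{text}</w:t></w:r></w:p></w:tc>')


def _row(values):
    return "<w:tr>" + "".join(_cell(v) for v in values) + "</w:tr>"


def rows_to_docx(columns, rows, path):
    """Write a header row plus data rows as one table in a .docx file.

    Hypothetical helper: `columns` is a list of names, `rows` an iterable
    of tuples (for example the result of DataFrame.collect()).
    """
    grid = "<w:tblGrid>" + "<w:gridCol/>" * len(columns) + "</w:tblGrid>"
    table = ("<w:tbl><w:tblPr/>" + grid + _row(columns)
             + "".join(_row(r) for r in rows) + "</w:tbl>")
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{table}<w:p/></w:body>'
        '</w:document>'
    )
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", CONTENT_TYPES)
        archive.writestr("_rels/.rels", RELS)
        archive.writestr("word/document.xml", document)


# Assumed usage with the DataFrame from the question (local path is made up):
# rows_to_docx(f_data.columns, f_data.collect(), "/tmp/test.docx")
```

The usage line writes to a local driver path on purpose. On Databricks you would then copy the file to DBFS as a separate step, for example with `dbutils.fs.cp("file:/tmp/test.docx", "dbfs:/FileStore/test/test.docx")`, since writing zip-based files straight to the DBFS mount may not work.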
