<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Save Spark DataFrame to shape file (.shp format) in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/save-spark-dataframe-to-shape-file-shp-format/m-p/10562#M5725</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I know how to create a .shp file from a GeoPandas GeoDataFrame using code similar to this, as also &lt;A href="https://stackoverflow.com/questions/73231194/how-to-save-dataframe-as-shp-geojson-in-pyspark-databricks" alt="https://stackoverflow.com/questions/73231194/how-to-save-dataframe-as-shp-geojson-in-pyspark-databricks" target="_blank"&gt;mentioned on SO&lt;/A&gt;:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;gpd_df = geopandas.GeoDataFrame(pandas_df, geometry='geom')
gpd_df.to_file("username/nh.shp")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;However, I have .parquet files that I can load directly into a Spark DataFrame, and I want to create and save a shapefile from it. Unfortunately, I'm not sure whether that's possible: I can't see the .shp format among the &lt;A href="https://spark.apache.org/docs/latest/sql-data-sources-load-save-functions.html" alt="https://spark.apache.org/docs/latest/sql-data-sources-load-save-functions.html" target="_blank"&gt;supported formats&lt;/A&gt;. I also checked Sedona but found only &lt;A href="https://github.com/apache/sedona/blob/master/core/src/main/java/org/apache/sedona/core/formatMapper/shapefileParser/ShapefileReader.java" alt="https://github.com/apache/sedona/blob/master/core/src/main/java/org/apache/sedona/core/formatMapper/shapefileParser/ShapefileReader.java" target="_blank"&gt;ShapefileReader&lt;/A&gt;, which does not allow saving/writing. What is the state of the art for working with shapefiles?&lt;/P&gt;</description>
    <pubDate>Fri, 27 Jan 2023 13:07:14 GMT</pubDate>
    <dc:creator>Bartek</dc:creator>
    <dc:date>2023-01-27T13:07:14Z</dc:date>
    <item>
      <title>Save Spark DataFrame to shape file (.shp format)</title>
      <link>https://community.databricks.com/t5/data-engineering/save-spark-dataframe-to-shape-file-shp-format/m-p/10562#M5725</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I know how to create a .shp file from a GeoPandas GeoDataFrame using code similar to this, as also &lt;A href="https://stackoverflow.com/questions/73231194/how-to-save-dataframe-as-shp-geojson-in-pyspark-databricks" alt="https://stackoverflow.com/questions/73231194/how-to-save-dataframe-as-shp-geojson-in-pyspark-databricks" target="_blank"&gt;mentioned on SO&lt;/A&gt;:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;gpd_df = geopandas.GeoDataFrame(pandas_df, geometry='geom')
gpd_df.to_file("username/nh.shp")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;However, I have .parquet files that I can load directly into a Spark DataFrame, and I want to create and save a shapefile from it. Unfortunately, I'm not sure whether that's possible: I can't see the .shp format among the &lt;A href="https://spark.apache.org/docs/latest/sql-data-sources-load-save-functions.html" alt="https://spark.apache.org/docs/latest/sql-data-sources-load-save-functions.html" target="_blank"&gt;supported formats&lt;/A&gt;. I also checked Sedona but found only &lt;A href="https://github.com/apache/sedona/blob/master/core/src/main/java/org/apache/sedona/core/formatMapper/shapefileParser/ShapefileReader.java" alt="https://github.com/apache/sedona/blob/master/core/src/main/java/org/apache/sedona/core/formatMapper/shapefileParser/ShapefileReader.java" target="_blank"&gt;ShapefileReader&lt;/A&gt;, which does not allow saving/writing. What is the state of the art for working with shapefiles?&lt;/P&gt;</description>
      <pubDate>Fri, 27 Jan 2023 13:07:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/save-spark-dataframe-to-shape-file-shp-format/m-p/10562#M5725</guid>
      <dc:creator>Bartek</dc:creator>
      <dc:date>2023-01-27T13:07:14Z</dc:date>
    </item>
    <item>
      <title>Re: Save Spark DataFrame to shape file (.shp format)</title>
      <link>https://community.databricks.com/t5/data-engineering/save-spark-dataframe-to-shape-file-shp-format/m-p/10563#M5726</link>
      <description>&lt;P&gt;@Bartosz Maciejewski:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Spark has no native support for writing shapefiles. However, you can collect your Spark DataFrame to pandas and write it with a third-party library such as GeoPandas or PyShp. Note that toPandas() brings all rows to the driver, so this only works when the data fits in driver memory.&lt;/P&gt;&lt;P&gt;Here's an example that converts a Spark DataFrame to a GeoDataFrame with GeoPandas and saves it as a shapefile.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import geopandas as gpd
from pyspark.sql import SparkSession

# create SparkSession
spark = SparkSession.builder.appName("SparkGeoPandas").getOrCreate()

# create sample Spark DataFrame; geometries are kept as WKT strings
# because Spark cannot infer a schema for shapely objects
df = spark.createDataFrame([(1, "POINT (0 0)"), (2, "POINT (1 1)")], ["id", "wkt"])

# collect to pandas on the driver, then parse the WKT column into geometries
pdf = df.toPandas()
gdf = gpd.GeoDataFrame(pdf.drop(columns="wkt"), geometry=gpd.GeoSeries.from_wkt(pdf["wkt"]))

# save GeoDataFrame to Shapefile
gdf.to_file("path/to/shapefile.shp", driver="ESRI Shapefile")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Alternatively, you can use the PyShp library instead of GeoPandas.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Mar 2023 04:14:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/save-spark-dataframe-to-shape-file-shp-format/m-p/10563#M5726</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-03-09T04:14:39Z</dc:date>
    </item>
  </channel>
</rss>

