Mosaic's grid_boundary method returns inconsistent geometries

kll
New Contributor III

I am applying Mosaic's `grid_boundary` method to a Spark DataFrame containing a set of `h3_hex_id` values. The geometries returned are not consistent, i.e. the coordinates appear to be in either `lat, long` or `long, lat` order.


Here is some sample data:

```
import pyspark.sql.functions as F
from mosaic import enable_mosaic, grid_boundary

# Assumed setup: Mosaic is installed on the cluster and enabled for this session
enable_mosaic(spark, dbutils)

# Create a Spark DataFrame with an h3_hex_id column
df = spark.createDataFrame([
    ("point1", 612232698081050623),
    ("point2", 611987238200279039),
    ("point3", 612252103481491455),
], ["name", "h3_hex_id"])


# Get Polygon geometries as WKT strings
df1 = df.withColumn("geometry", grid_boundary(F.col("h3_hex_id"), format_name=F.lit("WKT")))
df1.show()

+------+------------------+--------------------+
|  name|         h3_hex_id|            geometry|
+------+------------------+--------------------+
|point1|612232698081050623|POLYGON ((42.1261...|
|point2|611987238200279039|POLYGON ((-145.83...|
|point3|612252103481491455|POLYGON ((-138.08...|
+------+------------------+--------------------+

```

This is causing issues with downstream tasks such as spatial joins. Why are the geometries returned by the `grid_boundary` method not consistent?
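
One way to narrow this down is to check which axis orders each WKT string could actually be in. The sketch below is a hypothetical helper, not part of Mosaic or h3. It relies only on the fact that latitude must lie in [-90, 90] and longitude in [-180, 180]. A leading value such as 42.1261 is valid as either a latitude or a longitude, so the truncated output above cannot show a swap by itself; the full coordinate pairs are needed. The function name `possible_axis_orders` and the sample coordinates are made up for illustration, and the regex assumes plain decimal numbers (no scientific notation).

```python
import re

# Matches one "first second" coordinate pair of plain decimal numbers.
_PAIR = re.compile(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)")


def possible_axis_orders(wkt: str) -> list[str]:
    """Return the axis orders that every vertex of a WKT geometry is valid for.

    "lon lat" means the first value is a longitude and the second a latitude;
    "lat lon" means the reverse. Both are returned when the values are ambiguous.
    """
    pairs = [(float(a), float(b)) for a, b in _PAIR.findall(wkt)]
    if not pairs:
        raise ValueError(f"no coordinate pairs found in: {wkt!r}")

    orders = []
    # Valid as lon/lat if every first value fits longitude and every second fits latitude.
    if all(abs(a) <= 180 and abs(b) <= 90 for a, b in pairs):
        orders.append("lon lat")
    # Valid as lat/lon if every first value fits latitude and every second fits longitude.
    if all(abs(a) <= 90 and abs(b) <= 180 for a, b in pairs):
        orders.append("lat lon")
    return orders


# Made-up polygons, not the real boundaries of the cells above.
print(possible_axis_orders("POLYGON ((-150.0 61.0, -149.9 61.1, -150.0 61.0))"))
print(possible_axis_orders("POLYGON ((42.1 10.5, 42.2 10.4, 42.1 10.5))"))
```

Applying this to the untruncated strings (for example after `df1.show(truncate=False)`) would show whether any row is only valid as `lat lon`. Rows that come back with both orders need a comparison against a trusted reference, such as the cell's known location, to decide.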
