I am new to using `Mosaic` on Databricks. My task is to create a heatmap of counts per H3 hexagon at a given resolution. Since the dataset is quite large, I opted for Mosaic. As a first step, I am trying to create the hexagons covering a given area, but the following code does not generate hexagons covering the full square, as illustrated below:
from pyspark.sql import functions as F
from pyspark.sql.functions import lit
import mosaic as mos
from mosaic import enable_mosaic, st_point
# Register Mosaic's functions with the active Spark session
enable_mosaic(spark, dbutils)
lons = [-80., -80., -70., -70., -80.]
lats = [ 35., 45., 45., 35., 35.]
bounds_df = (
    spark
    .createDataFrame({"lon": lon, "lat": lat} for lon, lat in zip(lons, lats))
    .coalesce(1)
    .withColumn("point_geom", st_point("lon", "lat"))
)
bounds_df.show()
from mosaic import st_makeline
bounds_df = (
    bounds_df
    .groupBy()
    .agg(F.collect_list("point_geom").alias("bounding_coords"))
    .select(st_makeline("bounding_coords").alias("bounding_ring"))
)
bounds_df.show()
from mosaic import st_makepolygon
bounds_df = bounds_df.select(st_makepolygon("bounding_ring").alias("bounds"))
bounds_df.show()
hexs = (
    bounds_df
    .select(mos.mosaic_explode("bounds", lit(5)))
    .select("index.*")
)
hexs.show()
%%mosaic_kepler
hexs "index_id" "h3"
Do you know why there are gaps between the hexagons?
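
For reference, here is a rough sanity check of how many resolution-5 cells a full cover of this square should contain. It is only a sketch under stated assumptions: a spherical Earth, and an average H3 cell area of about 252.9 km² at resolution 5 (the figure I took from the H3 documentation's cell statistics). The helper name `expected_cell_count` is my own and is not part of Mosaic or H3.

```python
import math

# Assumption: average H3 hexagon area at resolution 5, in km^2,
# taken from the cell statistics table in the H3 documentation.
AVG_CELL_AREA_KM2_RES5 = 252.9

# Assumption: spherical Earth with the mean radius in km.
EARTH_RADIUS_KM = 6371.0088


def expected_cell_count(lon_min, lon_max, lat_min, lat_max,
                        cell_area_km2=AVG_CELL_AREA_KM2_RES5):
    """Roughly estimate how many H3 cells are needed to cover a lon/lat box."""
    # Area of a lon/lat box on a sphere: R^2 * d_lon * (sin(lat_max) - sin(lat_min))
    d_lon = math.radians(lon_max - lon_min)
    band = math.sin(math.radians(lat_max)) - math.sin(math.radians(lat_min))
    box_area_km2 = EARTH_RADIUS_KM ** 2 * d_lon * band
    return round(box_area_km2 / cell_area_km2)


# Same square as in the question: lon -80..-70, lat 35..45
estimate = expected_cell_count(-80.0, -70.0, 35.0, 45.0)
print(f"expected number of resolution-5 cells: about {estimate}")
```

Comparing this estimate with `hexs.count()` and with the number of hexagons actually drawn on the map should show whether cells are missing from the DataFrame itself or only from the rendered output.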