Geometry Type not converted into proper binary format when reading through Federated Catalog

shreya24
New Contributor II

Hi,

When reading a geometry column from a sql server into Databricks through foreign/federated catalog the tranformation of geometry type to binary type is not in proper format or I am not able to find a way I can decode that binary.

for example, for point, POINT (40.1150970058931 -8.3855249364345 30.3999996185303)
I am getting the value as 5hAAAAENrI5DhmPFIMDVFqp/uw5EQAcAAGBmZj5A whereas I am expecting AekDAADVFqp/uw5EQKyOQ4ZjxSDABwAAYGZmPkA=

The value AekDAADVFqp/uw5EQKyOQ4ZjxSDABwAAYGZmPkA=, I got when I read it using jdbc connection and querying the table as Shape.AsBinaryZM(), but in federated catalog I cannot define the query format or define a specific query.

Is there any way I can get proper binary into the geometry column

Thanks

AbhaySingh
Databricks Employee
Databricks Employee

Give this a shot

Create a view in SQL Server that converts geometry to Well-Known Text before federating:

-- Create view in SQL Server
CREATE VIEW dbo.vw_spatial_converted AS
SELECT
id,
location_name,
location.STAsText() AS geom_wkt,
location.STSrid() AS srid,
location.AsBinaryZM() AS geom_wkb_zm, -- If you need Z/M coordinates
other_columns
FROM dbo.locations;

Then federate the view instead of the base table and convert in Databricks:

# In Databricks (requires DBR 17.1+ for native GEOMETRY support)
from pyspark.sql.functions import expr

df = spark.table("foreign_catalog.sqlserver_db.vw_spatial_converted")

# Convert WKT to native GEOMETRY type
df_spatial = df.withColumn(
"geometry",
expr("ST_GeomFromWKT(geom_wkt, CAST(srid AS INT))")
)

# Now you can use Databricks spatial functions
result = df_spatial.where(
expr("ST_Within(geometry, ST_GeomFromWKT('POLYGON((...))', 4326))")
)

View solution in original post