02-06-2023 03:31 AM
The usual way to use Apache Sedona inside PySpark is by first registering Sedona types and functions with
SedonaRegistrator.registerAll(spark)
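For context, a minimal sketch of this explicit flow (assuming the Sedona jars and the apache-sedona Python package are already installed on the cluster):

from pyspark.sql import SparkSession
from sedona.register import SedonaRegistrator

spark = SparkSession.builder.appName("sedona-example").getOrCreate()
SedonaRegistrator.registerAll(spark)

# After registration, Sedona's ST_ functions resolve in Spark SQL.
spark.sql("SELECT ST_AsText(ST_Point(1.0, 2.0)) AS wkt").show()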
We need these to be auto-registered when the cluster starts (so that we can, for example, run geospatial queries through the Databricks SQL Connector for Python).
From my understanding, auto-registration should be achievable by adding the following cluster configuration, but it doesn't work:
spark.sql.extensions org.apache.sedona.sql.SedonaSqlExtensions
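For local testing, the same setting can be sketched programmatically when building the session (again assuming the Sedona jars are on the classpath):

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.sql.extensions", "org.apache.sedona.sql.SedonaSqlExtensions")
    .getOrCreate()
)

# If the extension loaded, ST_ functions should work without registerAll.
spark.sql("SELECT ST_AsText(ST_Point(1.0, 2.0))").show()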
Am I missing something?
Is my expectation wrong?
04-09-2023 07:35 AM
@Giovanni Allegri:
The configuration you have provided is for registering the Sedona SQL extensions with Spark SQL. However, to register Sedona types and functions with PySpark, you need to use a different configuration.
You can add the following configuration to the Spark cluster configuration to enable automatic registration of Sedona types and functions with PySpark:
spark.extraListeners org.apache.sedona.core.serde.SedonaSQLRegistrator
This will enable automatic registration of Sedona types and functions when a PySpark session is created. Alternatively, you can register Sedona types and functions explicitly in your PySpark code with the SedonaRegistrator.registerAll(spark) method; however, that requires calling it every time you create a new PySpark session, as in the sketch below.
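If you go the explicit route, a small helper keeps that per-session call in one place (a sketch; sedona_session is a hypothetical name, not part of the Sedona API):

from pyspark.sql import SparkSession
from sedona.register import SedonaRegistrator

def sedona_session(app_name="sedona-app"):
    # Create (or reuse) a session, then register Sedona types and functions.
    spark = SparkSession.builder.appName(app_name).getOrCreate()
    SedonaRegistrator.registerAll(spark)
    return spark

spark = sedona_session()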
I hope this helps!
10-03-2024 07:39 AM
Hi, after adding the suggested config, I am getting the following error:
Caused by: java.lang.ClassNotFoundException: org.apache.sedona.core.serde.SedonaSQLRegistrator not found in com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader@7d1cdebf
What should I do to fix this?
04-12-2023 02:41 AM
Hi @Giovanni Allegri
Thank you for posting your question in our community! We are happy to assist you.
To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?
This will also help other community members who may have similar questions in the future. Thank you for your participation, and let us know if you need any further assistance!