
SedonaSqlExtensions is not autoregistering types and functions

giohappy
New Contributor III

The usual way to use Apache Sedona inside PySpark is to first register the Sedona types and functions with

SedonaRegistrator.registerAll(spark)
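
For context, here is the full explicit flow in a notebook (a minimal sketch, assuming the apache-sedona Python package and its jars are installed on the cluster):

from pyspark.sql import SparkSession
from sedona.register import SedonaRegistrator

spark = SparkSession.builder.getOrCreate()
SedonaRegistrator.registerAll(spark)  # registers geometry types and the ST_* SQL functions

spark.sql("SELECT ST_AsText(ST_Point(1.0, 2.0))").show()  # works only after registration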

We need these to be auto-registered when the cluster starts (so that we can, for example, run geospatial queries through the Databricks SQL Connector for Python).

From my understanding, auto-registration can be achieved by adding the following cluster configuration, but it doesn't work:

spark.sql.extensions org.apache.sedona.sql.SedonaSqlExtensions
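
This is how I check whether the extension took effect in a fresh session (ST_Point is one of Sedona's SQL functions; the exact query is just an illustration):

print(spark.conf.get("spark.sql.extensions", ""))  # should list org.apache.sedona.sql.SedonaSqlExtensions

spark.sql("SELECT ST_AsText(ST_Point(1.0, 2.0))").show()  # fails with an undefined-function error if Sedona was not registered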

Am I missing something?

Is my expectation wrong?


3 REPLIES

Anonymous
Not applicable

@Giovanni Allegri:

The configuration you have provided is for registering the Sedona SQL extensions with Spark SQL. However, to register Sedona types and functions with PySpark, you need to use a different configuration.

You can add the following configuration to the Spark cluster configuration to enable automatic registration of Sedona types and functions with PySpark:

spark.extraListeners org.apache.sedona.core.serde.SedonaSQLRegistrator

This will enable automatic registration of Sedona types and functions when a PySpark session is created. Alternatively, you can register Sedona types and functions explicitly in your PySpark code with the SedonaRegistrator.registerAll(spark) method, but you would then have to call it every time you create a new PySpark session.
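
If you take the explicit route, one way to avoid forgetting that call is to funnel session creation through a small helper (a hypothetical sketch, not part of Sedona or Databricks):

def get_sedona_session():
    # Hypothetical helper: create or reuse a SparkSession and register
    # Sedona's types and ST_* functions on it before returning it.
    from pyspark.sql import SparkSession
    from sedona.register import SedonaRegistrator
    spark = SparkSession.builder.getOrCreate()
    SedonaRegistrator.registerAll(spark)
    return spark

spark = get_sedona_session()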

I hope this helps!

Hi, after adding the suggested config, I am getting the following error:

Caused by: java.lang.ClassNotFoundException: org.apache.sedona.core.serde.SedonaSQLRegistrator not found in com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader@7d1cdebf

What should I do to fix this?

Anonymous
Not applicable

Hi @Giovanni Allegri,

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
