Data Engineering

Ingesting geospatial data into a table

cpd
New Contributor II

I'm just getting started with Databricks and wondering if it is possible to ingest a GeoJSON or GeoParquet file into a new table without writing code? My goal here is to load vector data into a table and perform H3 polyfill operations on all the vector geometries in the table. In principle, it seems like this should all be possible with SQL commands on the platform, once I'm past the table creation step. Any pointers to get me oriented would be greatly appreciated!

Chris

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz_Fatma
Community Manager

Hi @cpd , 

  • First, create a new table in Databricks. You can do this with the Databricks UI or, if you prefer SQL, with the following command:
  • CREATE TABLE my_geospatial_table
    USING <file_format>
    OPTIONS (
      path '<path_to_your_geojson_or_geoparquet_file>'
    )
    
  • Here <file_format> is a Spark data source name, typically json for a GeoJSON file or parquet for a GeoParquet file. Once the table is created, it reads its data from the file at that path.
  • You won't need to write additional code for this step.
  • Now that your data is in the table, you can use SQL commands to perform H3 polyfill operations on the vector geometries.
  • For example, to compute an H3 index for each point, you can use the h3_longlatash3 function (one of Databricks' built-in H3 functions), where x is the longitude and y is the latitude:
  • SELECT
        h3_longlatash3(x, y, resolution) AS h3_index,
        other_columns
    FROM my_geospatial_table
    
  • For polygon geometries, the corresponding polyfill function is h3_polyfillash3, which takes a geometry (as WKT, WKB, or GeoJSON) and a resolution.
  • Databricks notebooks also support visualization tools (such as Folium maps) to visualize geospatial data dire....
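
The H3 polyfill step mentioned above can be sketched in Databricks SQL. This is a minimal example, not a definitive recipe: h3_polyfillash3 is one of Databricks' built-in H3 functions and returns an array of H3 cell IDs for a polygon supplied as WKT, WKB, or GeoJSON. The column names geometry and other_columns and the resolution 9 are assumptions for illustration, and the query assumes compute that supports the H3 expressions.

```sql
-- Self-contained check: polyfill a literal WKT polygon at resolution 2.
SELECT h3_polyfillash3(
  'POLYGON((-122.4194 37.7749, -118.2437 34.0522, -74.0060 40.7128, -122.4194 37.7749))',
  2
) AS h3_cells;

-- Table version (hypothetical column names): explode the returned array
-- so that each output row holds one H3 cell for one input geometry.
SELECT
  other_columns,
  explode(h3_polyfillash3(geometry, 9)) AS h3_cell
FROM my_geospatial_table;
```

If string cell IDs are preferred over BIGINT values, h3_polyfillash3string can be used in the same position.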


2 REPLIES

cpd
New Contributor II

Thank you @Kaniz_Fatma - much appreciated!
