
Ingesting geospatial data into a table

cpd
New Contributor II

I'm just getting started with Databricks and wondering if it is possible to ingest a GeoJSON or GeoParquet file into a new table without writing code? My goal here is to load vector data into a table and perform H3 polyfill operations on all the vector geometries in the table. In principle, it seems like this should all be possible with SQL commands on the platform, once I'm past the table creation step. Any pointers to get me oriented would be greatly appreciated!

Chris


Accepted Solutions

Kaniz
Community Manager

Hi @cpd , 

  • First, create a new table in Databricks. You can do this with SQL commands or through the Databricks UI. If you prefer SQL, you can run a command like the following to create the table:
  • CREATE TABLE my_geospatial_table
    USING <file_format>
    OPTIONS (
      'path' '<path_to_your_geojson_or_geoparquet_file>'
    )
    
    • Once the table is created, Databricks reads the data from the file at that path, so no separate load step is needed. For GeoParquet, <file_format> is parquet, and the geometry column normally comes through as WKB binary. GeoJSON is plain JSON, so with the json format it generally loads as nested JSON rather than as a ready-made geometry column.
    • Now that your data is in the table, you can use SQL commands to perform H3 polyfill operations on the vector geometries.
    • Databricks SQL has built-in H3 functions for this. h3_polyfillash3(geometry, resolution) returns the array of H3 cells contained in a polygon, and h3_longlatash3(longitude, latitude, resolution) returns the cell for a single point. For example, to create an H3 index from point coordinates (x, y, and resolution are placeholders for your own columns and values):
    • SELECT
          h3_longlatash3(x, y, resolution) AS h3_index,
          other_columns
      FROM my_geospatial_table
      
    • Databricks also provides visualization tools (such as Folium maps) to visualize geospatial data dire....
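
If the GeoJSON route turns out to need a little preparation, here is a minimal sketch of one way to do it, not a definitive recipe. A GeoJSON FeatureCollection is nested, so it can help to flatten it into one record per feature, with the geometry kept as a GeoJSON string, before creating the table with the json format. The function names features_to_rows and to_json_lines and the sample data are made up for illustration. The assumption that a GeoJSON string can be passed straight to the H3 polyfill function (which is documented to accept GeoJSON alongside WKT and WKB) is worth confirming for your runtime.

```python
import json


def features_to_rows(feature_collection):
    """Flatten a GeoJSON FeatureCollection (as a dict) into one flat row per feature.

    Each row holds the feature's properties plus a "geometry" column containing
    the geometry serialized as a GeoJSON string. A property that is itself
    named "geometry" would be overwritten (a simplification in this sketch).
    """
    rows = []
    for feature in feature_collection.get("features", []):
        row = dict(feature.get("properties") or {})
        row["geometry"] = json.dumps(feature.get("geometry"))
        rows.append(row)
    return rows


def to_json_lines(rows):
    """Serialize rows as newline-delimited JSON, one record per line."""
    return "\n".join(json.dumps(row) for row in rows)


# Hypothetical sample data: a single square polygon with one property.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "park"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [
                    [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
                ],
            },
        }
    ],
}

rows = features_to_rows(sample)
json_lines = to_json_lines(rows)
print(json_lines)
```

Writing json_lines to a file gives one JSON record per line, which the json format reads as one table row per feature. From there the polyfill step is a SQL call on the geometry column.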


2 REPLIES

cpd
New Contributor II

Thank you @Kaniz - much appreciated!
