โ04-12-2023 08:47 AM
โ04-15-2023 05:57 PM
@Grainne Farrantโ :
It is possible to connect QGIS to Databricks using ODBC, but it requires additional configuration. Here are the general steps to follow:
Note that some users have reported issues with using ODBC drivers with QGIS on MacOS, so you may need to use a different approach if you are on that platform.
โ04-18-2023 12:36 AM
Hello,
I'm on Windows. I have downloaded the driver, set-up the DSN for Databricks and tested it - all works fine. I have used this DSN to connect to Databricks from ArcGIS.
In QGIS, I followed your instructions and when I click test connections, I get an error - Connection failed- check settings and try again. Extended error information: Unable to initialize ODBC connection to DSN for <username>/<password>@<DSN>
I'm using QGIS 3.28.5-Firenze.
โ12-17-2024 05:42 PM
I have been able to connect to Databricks in QGIS using the ODBC connection. An extra step not listed above that I had to do was to generate a personal access token (Databricks user Settings > Developer > Access tokens > Manage > Generate new token).
When setting up the DSN in step 2 above, I used the "User Name and Password" Authentication mechansim, using "token" as the username and the token generated above as the password. SSL has to be enabled in the SSL Options, and the value for Thrift Transport set to HTTP with the HTTP path value (available from the Databricks cluster's Advanced Options JDBC/ODBC tab) set in the HTTP Options
Then, once in QGIS, I was able to go to Layer > Add Layer > Add Vector Layer, choose Database, type ODBC, create a new connection, with the value of the Data Source Name when creating the DSN in ODBC DSN Setup used as the value for Database in the Connection information, and for the Authentication configuration, add an authentication configuration, using the same username ("token") and password (personal access token value) from creating the DSN
This all worked for me, and I was successfully able to connect to the database and the tables within them ...
However, despite the tables I want having fields of "geometry" type, the connector doesn't seem to recognise these fields and be able to draw them in QGIS.
This is where I am currently stuck, if anyone is able to suggest what can be done to display the spatial features from the table stored in Databricks. The geometry features are valid, as I am able to visualise them using SedonaKepler in the local Databricks notebook. In QGIS, the geometry field values appear as "BLOB" in the attribute table and do not seem to be recognised as spatial features.
โ01-06-2025 11:27 AM
Hi!
Did you do something on the DB-side to be able to connect from QGIS?
When I test my connection in QGIS it works, but when I try to "add", there's an error message saying that "
invalid Datasource: ODBC:token/xxxxxxxxxxxxxxxxxxxxxxxx@sql_warehouse_analyst is not a valid datasource or recognize" (free translation from French ๐
Our storage is parquet. Do we need Delta tables?
Thanks!
โ01-06-2025 02:28 PM
I was loading in Delta tables, yes.
โ01-06-2025 02:36 PM
Maybe that's our problem. Thanks!
โ01-06-2025 02:42 PM
Good luck, and let me know if you can get the spatial features showing as well!
โ01-07-2025 08:39 AM
I've confirmed that we use Delta Table. For the geometry column, I was going to do a ST_GeomFromWKT or ... to create a Query Layer...
On Databricks, in the Query History, I have what is attached... It's not a query. Problem from QGIS? Could I ask which version you are using? I have 3.38.2-Grenoble
Thanks!
โ01-07-2025 10:47 AM
We needed to add our schema as Server Side Properties in the Advanced Options. We now see our tables in QGIS.
To view the geometry, you can use the Database Manager and choose "Virtual Layers/Project Layers" (I'm guessing, my QIGS is in French) and use 'st_geometryfromtext' to create a Query Layer.
I tried with points, but as long as your geometry is in WKT in Databricks, it should work!
โ01-07-2025 03:39 PM
With respect to adding your schema as Server Side Properties in the Advanced Options, could you please provide a screenshot so that I can see what the Keys and Values look like for your schema? Maybe this is what I am missing??
I am using QGIS 3.34 (long term release version)
My Delta table in Databricks had the spatial features as a geometry type in its schema, but in QGIS it did not recognise it as geometry, with the attribute table not showing any spatial features, with "BLOB" appearing as the value for all rows in the geometry field.
What I managed to do, however, was to add a field to the Delta table of the geometry WKT and store the field as a string type. When QGIS reads in this table, I then created a virtual layer, with the query on the Delta table running a ST_GeomFromText transformation on the geometry string type field, and it displayed the spatial features that way.
While this is a workaround, I would like to think it is possible to store the geometry features in the Delta table as a geometry type, and for QGIS to recognise that field as geometry.
Please let me know about how the schema was added to the Server Side Properties ... if that doesn't work, I may have to stick with keeping the geometry as a string format in the Delta Table
Merci! ๐
โ01-08-2025 06:32 AM
The key is "databricks.catalog", but it didn't solved my problem with the geometry either. It only works for WKT (but, don't quote me on this ๐ I heard that Databricks may be working on a geometry type that could be native (we have geometry as a UDT and use Sedona).
Also, I have probably 30 tables in my schema (not that much) with millions of row (again, not billions) and yesterday it analyzed the schema for hours (the objects being greyed and unselectionable) before being able to add them to QGIS.
โ01-08-2025 02:02 PM
Ah, maybe that's what the problem is. With respect to the large number of tables, I have experienced the same problem. I have tried changing the ODBC connection's HTTP path so that it connected to a Serverless Warehouse, but I haven't been able to work that out yet. Also, while I am still trying to work out how to only connect to a single named schema (I haven't been able to figure out a wildcard), I can connect to a single Delta table, which is a much quicker process. In the Database field, I added the name of the table after the name of the ODBC connection.
It only lists that one table, but it does make the process much quicker. Hopefully that is of some help. If you make progress in other areas, I'll be happy to hear from you!
โ01-09-2025 06:29 AM
That should probably help, but I tried and my table has 0 rows when the same table loaded with all the schema analyzed has 835...
Still have testing to do, but with that, you can now choose a single file to add using the connection string
ODBC:token/yourpersonalprivatetoken@yourodbcconnection,yourtable
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโt want to miss the chance to attend and share knowledge.
If there isnโt a group near you, start one and help create a community that brings people together.
Request a New Group