10-30-2021 10:20 AM
Is there a way to use sql desktop tools? because delta OSS or databricks does not provide desktop client (similar to azure data studio) to browse and query delta lake objects.
I currently use databricks SQL , a webUI in the databricks workspace but searching for means beyond that, especially for end business users who are familiar with SSMS type clients.
@Bilal Aslam
11-01-2021 11:42 AM
@prasad vaze I don't use SSMS (not a Windows user), and I'm not even sure if it will work with Databricks. However, I did want to list the steps to get DBvisualizer to work:
10-31-2021 12:39 AM
Hi @prasad vaze - definitely, you can use dbeaver or DbVisualizer to do this. Download the drivers first and connect with one of these tools.
02-06-2022 08:33 AM
Hi Bilal, Old post, but I am trying to do the same, but with Databricks in Azure.
I downloaded the JDBC Driver following instructions here : https://docs.databricks.com/dev-tools/dbeaver.html
I have my Databricks cluster running, but when I then try to connect from Dbeaver, I get this error :
[Simba][SparkJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP Response code: 401, Error message: Unknown.
I have Googled around and cannot seem to find an answer. I have Can Manage permissions on the cluster, I did generate the Personal Access Token and it is part of the URL with userid set to token
Read through this page on MS site and I seem to have done everything, yet it does not work
Databricks ODBC and JDBC drivers - Azure Databricks | Microsoft Docs
@Bill Davis
02-16-2022 02:53 AM
@Navin Ladda apologies for the late reply. I went through the end to end process and verified that I'm indeed able to connect from dbeaver to Azure Databricks. Did you read the docs and exactly follow the steps? One common mistake is to not enter the JDBC URL correctly
02-24-2022 06:28 AM
I was able to resolve this. I did have an error in the JDBC URL where I still have < > surrounding the token value. Once I removed it, was able to connect Dbeaver to Databricks cluster.
The method to have it setup via Active directory was head spinning for me to follow along. But for now this time limited token method is fine to explore working with Databricks.
10-31-2021 10:41 PM
Thanks @Bilal Aslam I was able to connect via SQL Workbench/j by following this https://docs.databricks.com/integrations/bi/workbenchj.html and then also successfully connected from DBVisualizer
Trying to figure out how to connect via SSMS
11-01-2021 11:42 AM
@prasad vaze I don't use SSMS (not a Windows user), and I'm not even sure if it will work with Databricks. However, I did want to list the steps to get DBvisualizer to work:
02-17-2022 08:11 PM
I am able to connect via DBVisualizer and SQL workbench to delta lake. But our users don't want to use anything other than SSMS (sql server management studio) or Azure data studio. These are defacto clients for sql server users. It will be a great opportunity for delta lake (data bricks) to enable these tools for connection to delta lake ( many sql server users will convert to delta lake users)
02-25-2022 08:28 PM
@Bilal Aslam I still think databricks should provide a way to connect to delta tables from SSMS or azure data studio desktop clients. Because there are so many users of these tools , once delta tables are accessible from these tools then the delta lake will catch fire ( easy to talk users into switching from sql server/ azure synapse). Microsoft will not provide this obviously so its in databricks benefit to resolve this limitation. right now many business users in my company prefer to use sql db just because they can't query delta lake using their familiar client (SSMS / azure data studio)
03-01-2022 04:29 AM
@prasad vaze I totally agree. I'd love to support connecting from Azure Data Studio and even SSMS!
03-02-2022 08:58 PM
is databricks working on developing the client tool?
03-04-2022 04:19 AM
@prasad vaze we are working on integrating with lots of client tools.
03-05-2022 12:01 PM
@Bilal Aslam Great , I am looking forward to SSMS or Azure data studio integration. I understand that connecting to delta lake via SQL Endpoint or spark cluster is better performance experience than connecting via DSR (delta standalone reader library) . SSMS/Studio clients running on normal desktop/laptop don't have local spark running but they can always connect to delta lake via cluster/SQL Endpoint . If that happens then many of my users will start using delta lake instead of azure synapse
03-07-2022 11:03 AM
@prasad vaze Your comment made me think of our own environment. What is DSR ?
I have only briefly used Azure Data Studio for connecting to Azure SQL as I prefer Dbeaver. But your comment about using Delta Lake vs Azure Synapse...can you please elaborate as we plan to use Synapse as well. Do you use Azure Synapse as the final layer in your environment from where end users get their data/connect to Power BI etc ? Or are you envisioning being able to have these users connect directly to Delta Lake as Delta lake claims to give Relational database like Update/Delete capabilities with improved query performance ?
03-12-2022 11:38 AM
DSR is Delta Standalone Reader. see more here - https://docs.delta.io/latest/delta-standalone.html
Its a crate (and also now a py library) that allows you to connect to delta tables without using spark (e.g. directly from python and not using pyspark) so its good use case if you don't or can't have spark environment (e.g. user desktop or azure function app ) AND this is developer friendly but not business user friendly way to query delta lake
In our case , we "have to" use Synapse SQL DW as data consumption layer for end users / BI tools who are so much accustomed to traditional desktop client-to -sql server way of doing things. Its bit hard to move them from T-sql to spark sql but what has been harder is to move them from Sql server to delta lake because of the lack of direct connectivity from our most popular desktop clients (SSMS, azure data studio) to delta lake. Its not really direct connection because, there has to be a spark cluster between desktop client and backend delta lake . The cluster provides JDBC driver.
I see a potential in delta lake to become final data consumption layer someday. Performance and cost is better but user-friendliness its not there yet in my opinion due to above limitations and some other limitations (for e.g. Implementing row/column level security in delta lake is more involved than in RDBMS. Either you have to use OTS tool or databricks proprietary tool Unity Catalog which is still in private preview )
 
					
				
				
			
		
 
					
				
				
			
		
Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!
Sign Up Now