Data Engineering

Forum Posts

by jonathan-dufaul • Valued Contributor

12-14-2022 9:50:16 AM

2426 Views
2 replies
1 kudos

How do I specify column types when writing to a MSSQL server using the JDBC driver (

I have a PySpark dataframe that I'm writing to an on-prem MSSQL server; it's a stopgap while we convert data warehousing jobs over to Databricks. The processes that use those tables on the on-prem server rely on the tables maintaining the identical s...

Latest Reply

dasanro
New Contributor II

11-08-2023 7:50:13 AM

1 kudos

It's happening to me too! Did you find a solution, @jonathan-dufaul? Thanks!

1 More Replies
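
For readers with the same question about column types, here is a minimal sketch of one option. Spark's JDBC writer accepts a `createTableColumnTypes` option that overrides the column types used when Spark itself creates the target table. The column names, types, and the helper names below are hypothetical, and the replies to this thread are not shown in this listing, so this is not necessarily what the thread recommends.

```python
def build_create_table_column_types(column_types):
    """Render {"column": "TYPE"} as the comma-separated string that
    Spark's JDBC writer accepts in its createTableColumnTypes option."""
    if not column_types:
        raise ValueError("at least one column type is required")
    return ", ".join(f"{name} {sql_type}" for name, sql_type in column_types.items())


# Hypothetical columns and types, for illustration only.
column_types = {"customer_name": "VARCHAR(128)", "notes": "VARCHAR(1024)"}
column_types_option = build_create_table_column_types(column_types)


def write_with_column_types(df, jdbc_url, table, properties):
    """Write df over JDBC with explicit column types for table creation.

    Not called here: it needs a live Spark session and a reachable
    database. `properties` is assumed to carry user, password and driver.
    """
    (df.write
       .option("createTableColumnTypes", column_types_option)
       .jdbc(url=jdbc_url, table=table, mode="overwrite", properties=properties))
```

Two caveats apply. The types in this option must be valid Spark SQL types, so database-specific types may not be expressible this way. The option also only matters when Spark creates the table; if the table must keep an exact existing schema, pre-creating it and writing with `mode="append"`, or with `mode="overwrite"` plus the `truncate` option, may be a better fit.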

by AB_MN • New Contributor III

01-27-2023 10:40:12 AM

7100 Views
4 replies
1 kudos

Resolved! Read data from Azure SQL DB

I am trying to read data into a dataframe from Azure SQL DB using JDBC. Here is the code I am using:
driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
database_host = "server.database.windows.net"
database_port = "1433"
database_name = "dat...

Latest Reply

AB_MN
New Contributor III

01-27-2023 12:50:14 PM

1 kudos

That did the trick. Thank you!

3 More Replies
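
The accepted answer is not shown in this listing, so the following is only a sketch of how a JDBC read from Azure SQL DB is commonly assembled in PySpark. The driver class, host, and port come from the question; the database name, table name, credentials, and helper names are hypothetical placeholders.

```python
def sqlserver_jdbc_url(host, port, database):
    """Build a SQL Server JDBC URL of the form
    jdbc:sqlserver://host:port;databaseName=db."""
    return f"jdbc:sqlserver://{host}:{port};databaseName={database}"


def sqlserver_read_options(host, port, database, table, user, password):
    """Collect the options Spark's generic JDBC reader needs."""
    return {
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "url": sqlserver_jdbc_url(host, port, database),
        "dbtable": table,
        "user": user,
        "password": password,
    }


# "example_db", "dbo.example_table" and the credentials are placeholders.
read_options = sqlserver_read_options(
    host="server.database.windows.net",
    port="1433",
    database="example_db",
    table="dbo.example_table",
    user="example_user",
    password="example_password",
)


def read_table(spark, options):
    """Load the table into a DataFrame. Not called here because it
    needs a live Spark session and a reachable database."""
    return spark.read.format("jdbc").options(**options).load()
```

In a notebook, the call would look like `df = read_table(spark, read_options)`, ideally with the credentials pulled from a secret store instead of literals.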

by dng • New Contributor III

11-30-2022 4:39:44 PM

7748 Views
6 replies
10 kudos

Databricks JDBC Driver v2.6.29 Cloud Fetch failing for Windows Operating System

Hi everyone, I've been stuck for the past two days on this issue with my Databricks JDBC driver and I'm hoping someone can give me more insight into how to troubleshoot. I am using the Databricks JDBC driver in RStudio and the connection was working ...

Latest Reply

Prabakar
Databricks Employee

01-30-2023 9:09:08 AM

10 kudos

@Debbie Ng From your message, I see this failure started after a Windows update. Based on the conversation, you tried the latest version of the driver and still face the problem. I believe this is something related to the Java version compatib...

5 More Replies

by brian_0305 • New Contributor II

02-22-2023 11:45:58 AM

4634 Views
3 replies
2 kudos

Using JDBC to connect to the Databricks default cluster and read a table into a PySpark dataframe: every column value comes back as the column name

I used code like the following to connect over JDBC to the Databricks default cluster and read a table into a PySpark dataframe: url = 'jdbc:databricks://[workspace domain]:443/default;transportMode=http;ssl=1;AuthMech=3;httpPath=[path];AuthMech=3;UID=token;PWD=[your_ac...

Latest Reply

Anonymous
Not applicable

04-24-2023 9:02:41 PM

2 kudos

@yu zhang: It looks like the issue with the first code snippet you provided is that it is not specifying the correct query to retrieve the data from your database. When using the load() method with the jdbc data source, you need to provide a SQL quer...

2 More Replies

by hfrid • New Contributor II

04-11-2023 5:40:27 AM

6895 Views
1 reply
2 kudos

JDBC connector seems to be a bottleneck when trying to insert dataframe to Azure SQL Server

Hi! I am inserting a PySpark dataframe into Azure SQL Server and it takes a very long time. The database is an S4, but my dataframe of 17 million rows and 30 columns takes up to 50 minutes to insert. Is there a way to significantly speed this up? I a...

Latest Reply

Anonymous
Not applicable

04-15-2023 6:06:47 PM

2 kudos

@Hjalmar Friden: There are several ways to improve the performance of inserting data into Azure SQL Server using the JDBC connector. Increase the batch size: by default, the JDBC connector sends data in batches of 1000 rows at a time. You can increase th...
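
The batch-size advice in the reply can be sketched as follows. `batchsize` and `numPartitions` are real options of Spark's JDBC writer; the specific values, the URL, the table name, and the helper names are assumptions for illustration, and good values depend on the database tier and network.

```python
def jdbc_write_options(url, table, batch_size=10000, num_partitions=8):
    """Build JDBC writer options with a larger batch size and an
    explicit cap on the number of parallel connections."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    if num_partitions < 1:
        raise ValueError("num_partitions must be positive")
    return {
        "url": url,
        "dbtable": table,
        "batchsize": str(batch_size),          # rows per JDBC batch insert
        "numPartitions": str(num_partitions),  # max concurrent connections
    }


# Placeholder URL and table name.
write_options = jdbc_write_options(
    url="jdbc:sqlserver://example.database.windows.net:1433;databaseName=example_db",
    table="dbo.example_target",
)


def write_dataframe(df, options, user, password):
    """Append df to the target table. Not called here because it needs
    a live Spark session and a reachable database."""
    (df.repartition(int(options["numPartitions"]))
       .write.format("jdbc")
       .options(**options)
       .option("user", user)
       .option("password", password)
       .mode("append")
       .save())
```

Raising `numPartitions` opens more simultaneous connections, which helps only as long as the database can absorb the extra load, so it is worth increasing both settings gradually and measuring.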

by pandu • New Contributor II

03-08-2023 6:44:23 AM

2802 Views
2 replies
3 kudos

Connect to an Oracle database using JDBC and perform a merge

I would like to connect to an Oracle database using the JDBC driver and write code to perform a merge using Python.

Latest Reply

Vartika
Databricks Employee

03-31-2023 1:43:10 AM

3 kudos

Hi @Venkata Krishna Jonnalagadda, hope you are well. Just checking in. If @John Lourdu's answer helped, would you let us know and mark the answer as best? If not, would you be happy to give us more information? Thanks!

1 More Replies
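
The answer referenced above is not shown in this listing. As a rough sketch of one common pattern, the DataFrame can be written to a staging table over JDBC and an Oracle `MERGE` statement then run through an ordinary database connection, since Spark's JDBC writer only appends or overwrites. The table names, column names, and helper names below are hypothetical.

```python
def build_oracle_merge(target, source, key_columns, update_columns):
    """Build an Oracle MERGE statement that updates matching rows and
    inserts missing ones. Key columns are used only in the ON clause,
    because Oracle does not allow updating columns referenced there."""
    if not key_columns or not update_columns:
        raise ValueError("key_columns and update_columns must be non-empty")
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_columns)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in update_columns)
    all_columns = list(key_columns) + list(update_columns)
    insert_columns = ", ".join(all_columns)
    insert_values = ", ".join(f"s.{c}" for c in all_columns)
    return (
        f"MERGE INTO {target} t USING {source} s ON ({on_clause}) "
        f"WHEN MATCHED THEN UPDATE SET {set_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_columns}) VALUES ({insert_values})"
    )


# Hypothetical target, staging table, and columns.
merge_sql = build_oracle_merge(
    target="orders",
    source="orders_staging",
    key_columns=["order_id"],
    update_columns=["status", "amount"],
)


def run_merge(connection, sql):
    """Execute the MERGE on a Python DB-API connection and commit.
    Not called here because it needs a live Oracle connection."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql)
        connection.commit()
    finally:
        cursor.close()
```

The identifiers are interpolated directly into the SQL text, so this sketch is only suitable when table and column names come from trusted configuration, not from user input.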

by haggholm • New Contributor

02-24-2023 5:29:11 PM

2803 Views
2 replies
1 kudos

Resolved! Query with ORDER BY fails with HiveThriftServerError "requirement failed: Subquery … has not finished"

Using ODBC or JDBC to read from a table fails when I attempt to use an ORDER BY clause. In one sample case, I have a fairly small table (just 1946 rows).
select * from some_table order by some_field
Result: java.lang.IllegalArgumentException: requiremen...

Latest Reply

Anonymous
Not applicable

03-12-2023 9:39:52 PM

1 kudos

Hi @petter@hightouch.com Petter, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it doe...

1 More Replies

by RamyaN • New Contributor II

01-05-2023 4:29:51 AM

3514 Views
2 replies
3 kudos

How to read enum[] (array of enum) datatype from Postgres using Spark

We are trying to read a column of enum array datatype from Postgres as a string datatype in the target. We were able to achieve this by explicitly using the concat function while extracting, like below: val jdbcDF3 = spark.read .format("jdbc") .option(...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

01-05-2023 5:12:26 AM

3 kudos

You can try a custom schema for the JDBC read: .option("customSchema", "colname STRING")

1 More Replies
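
The `customSchema` suggestion above can be sketched in PySpark as follows. `customSchema` is a real option of Spark's JDBC reader; the column name, table name, and helper names are hypothetical, and the thread as shown here does not confirm whether this option alone is enough for an enum array column.

```python
def build_custom_schema(column_types):
    """Render {"column": "SPARK_TYPE"} as the comma-separated string
    that Spark's JDBC reader accepts in its customSchema option."""
    if not column_types:
        raise ValueError("at least one column type is required")
    return ", ".join(f"{name} {spark_type}" for name, spark_type in column_types.items())


# "status_list" is a hypothetical enum[] column to be read as a string.
custom_schema = build_custom_schema({"status_list": "STRING"})


def read_with_custom_schema(spark, url, table, schema):
    """Read a table over JDBC, overriding the inferred types of the
    listed columns. Not called here because it needs a live Spark
    session and a reachable database."""
    return (spark.read.format("jdbc")
            .option("url", url)
            .option("dbtable", table)
            .option("customSchema", schema)
            .load())
```

Columns not named in `customSchema` keep the types Spark infers from the database metadata, so only the problematic column needs to be listed.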

by jneira • New Contributor III

11-30-2022 6:07:38 AM

2305 Views
2 replies
2 kudos

"org.apache.hadoop.hive.ql.metadata.HiveException: at least one column must be specified for the table" non deterministic error in a `insert ... select ... ` clause

Hi, first of all, thanks for your work on Databricks SQL. Unfortunately, I am having a problem running insert-select statements programmatically using the JDBC driver. They all have the form: `insert into `mytable` select 1, 'foo', moreLiterals` The statem...

Latest Reply

jneira
New Contributor III

12-29-2022 6:47:01 AM

2 kudos

Thanks for the suggestion. Could you tell me more about how to check logs in the cluster?

1 More Replies

by huyd • New Contributor III

11-22-2022 2:47:12 PM

1452 Views
0 replies
4 kudos

Optimizing a batch load process, reading with the JDBC driver

I am doing a batch load from a database table using the JDBC driver. I am noticing in the Spark UI that there is both memory and disk spill, but only on one executor. I am also noticing that when trying to use the JDBC parallel read, it seems to run sl...
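
For reference, a JDBC parallel read in Spark is configured with four options that must be given together: `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions`. The sketch below only assembles those options; the URL, table, column, bounds, and helper names are hypothetical, and it is not a diagnosis of the slowdown described above.

```python
def parallel_read_options(url, table, partition_column,
                          lower_bound, upper_bound, num_partitions,
                          fetch_size=10000):
    """Build options for a partitioned JDBC read. The bounds set the
    partition stride only; they do not filter rows out of the result."""
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    if num_partitions < 1:
        raise ValueError("num_partitions must be positive")
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": partition_column,  # numeric, date, or timestamp column
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        "numPartitions": str(num_partitions),
        "fetchsize": str(fetch_size),         # rows fetched per round trip
    }


# Placeholder connection details and a hypothetical numeric key column.
parallel_options = parallel_read_options(
    url="jdbc:postgresql://example-host:5432/example_db",
    table="public.example_table",
    partition_column="id",
    lower_bound=1,
    upper_bound=1_000_000,
    num_partitions=8,
)


def read_in_parallel(spark, options):
    """Run the partitioned read. Not called here because it needs a
    live Spark session and a reachable database."""
    return spark.read.format("jdbc").options(**options).load()
```

If the values of the partition column are heavily skewed, or the bounds are far from the real minimum and maximum, most rows can land in a single partition, which is one possible reason for spill that shows up on only one executor.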

by sriramkumar • New Contributor II

05-25-2022 11:27:22 AM

1430 Views
2 replies
1 kudos

Reasons for new Databricks driver

What are the reasons behind Databricks going for their own driver? What differences are made when switching between the previous Spark driver and the new Databricks driver? Is there any specific document I can look at, or just the release notes? Also, w...

Latest Reply

Anonymous
Not applicable

07-22-2022 8:39:18 AM

1 kudos

Hey @Sriramkumar Thamizharasan, hope all is well! Just wanted to check in: if you were able to resolve your issue, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from...

1 More Replies

by sriramkumar • New Contributor II

05-25-2022 7:57:21 AM

3379 Views
3 replies
0 kudos

New Databricks Driver gives SQLNonTransientConnectionException when trying to connect to Databricks Instance

import com.databricks.client.jdbc.DataSource;
import java.sql.*;
public class testDatabricks {
    public static void main(String[] args) throws SQLException {
        String dbUrl = "jdbc:databricks://<hostname>:443;HttpPath=<HttpPath>;"; // Cop...

Latest Reply

Atanu
Databricks Employee

05-26-2022 3:04:07 AM

0 kudos

This looks like it is due to maintenance in the US. Are you still facing the issue, @Sriramkumar Thamizharasan? Is your workspace in eastus or eastus2?

2 More Replies

by Vamsee • New Contributor II

11-17-2021 7:57:48 AM

6673 Views
4 replies
4 kudos

Resolved! I am trying to connect to Databricks SQL using the JDBC driver com.simba.spark.jdbc.Driver. I can download it and install it on my local laptop, but I need it to be available in a Maven repository so I can include the Maven path in my project.

Please let me know if the JDBC driver is available in a Maven repository and what needs to be included in pom.xml to download it.

Latest Reply

BilalAslamDbrx
Databricks Employee

05-23-2022 9:04:11 AM

4 kudos

@Vamsee krishna kanth Arcot Good news, the driver is up on Maven: https://search.maven.org/artifact/com.databricks/databricks-jdbc

3 More Replies

by findinpath • Contributor

05-06-2022 1:38:30 AM

6462 Views
2 replies
3 kudos

Databricks 2.6.25 JDBC driver can't create tables with `GENERATED` columns

I'm using the Databricks JDBC driver recently made available via Maven: https://mvnrepository.com/artifact/com.databricks/databricks-jdbc/2.6.25 While trying to create a table with `GENERATED` columns I receive the following exception: Caused by: java.s...

Latest Reply

findinpath
Contributor

05-18-2022 9:43:40 PM

3 kudos

I was under the impression that this has been recognised as a bug and is being handled by Databricks. What do I need to do to report the issue officially as a bug?

1 More Replies

by SCOR • New Contributor II

02-08-2022 1:11:50 AM

2434 Views
3 replies
4 kudos

SparkJDBC42.jar issue?

Hi there! I am using SparkJDBC42.jar in my Java application to access my Delta Lake tables. The connection is made through a Databricks SQL endpoint, where I created a database and store my Delta tables in it. I have a simple code to open connection...

Latest Reply

jose_gonzalez
Databricks Employee

03-07-2022 3:10:08 PM

4 kudos

Hi @Seifeddine SNOUSSI, are you still having the issue, or were you able to resolve it? Please let us know.

2 More Replies