Data Engineering

Forum Posts

by mattmunz • New Contributor III

07-29-2022 7:15:25 PM

4469 Views
2 replies
4 kudos

JDBC Error: Error occured while deserializing arrow data

I am getting the following error in my Java application: java.sql.SQLException: [Databricks][DatabricksJDBCDriver](500618) Error occured while deserializing arrow data: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available. I beli...

Latest Reply

cvcore
New Contributor II

a month ago

4 kudos

For anyone encountering this issue in 2025, I was able to solve it by using the --add-opens=jdk.unsupported/sun.misc=ALL-UNNAMED option in combination with the latest JDBC driver (v2.7.1). I was using the driver in DBeaver, but I assume the issue coul...

1 More Replies

by Sweetnesh • New Contributor

05-29-2023 4:50:32 AM

2096 Views
2 replies
0 kudos

Not able to read S3 object through AssumedRoleCredentialProvider

SparkSession spark = SparkSession.builder()
    .appName("SparkS3Example")
    .master("local[1]")
    .getOrCreate();
spark.sparkContext().hadoopConfiguration().set("fs.s3a.access.key", S3_ACCOUNT_KEY);
spark.sparkContext().hadoopConf...

Latest Reply

Vartika
Databricks Employee

05-30-2023 1:52:57 AM

0 kudos

Hi @Sweetnesh Dholariya, Does @Debayan Mukherjee's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Thanks!

1 More Replies

by Databrickguy • New Contributor II

01-13-2023 9:42:13 AM

1311 Views
1 replies
0 kudos

How to use Java MaskFormatter in sparksql?

I created a function based on the Java MaskFormatter class in Databricks/Scala. But when I call it from Spark SQL, I receive this error message: Error in SQL statement: AnalysisException: Undefined function: formatAccount. This function is neither a built-in/t...

Latest Reply

Anonymous
Not applicable

04-10-2023 7:57:56 AM

0 kudos

@Tim zhang: The issue is that the formatAccount function is defined as a Scala function, but Spark SQL is looking for a SQL function. You need to register the Scala function as a SQL function so that it can be called from Spark SQL. You can register t...
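
For readers who want something concrete, here is a minimal Java sketch of the idea in the reply above. It is not the poster's code: the original formatAccount signature is cut off in the post, so the two-argument form, the class name AccountFormat, and the sample mask are all assumptions made for illustration. The MaskFormatter part runs on a plain JDK. The Spark registration is shown only as a comment because it needs Spark on the classpath and an existing SparkSession.

```java
import java.text.ParseException;
import javax.swing.text.MaskFormatter;

// Hypothetical stand-in for the poster's formatAccount function.
final class AccountFormat {

    // Applies a MaskFormatter mask such as "###-####" to a raw value.
    static String formatAccount(String account, String mask) {
        try {
            MaskFormatter formatter = new MaskFormatter(mask);
            // The input holds only the raw characters; literals such as '-'
            // are supplied by the mask itself.
            formatter.setValueContainsLiteralCharacters(false);
            return formatter.valueToString(account);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Value does not fit mask " + mask, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formatAccount("1234567", "###-####"));

        // With Spark available, the function could be registered by name so
        // that SQL can resolve it (assumes a SparkSession named `spark`):
        //
        //   spark.udf().register("formatAccount",
        //       (UDF2<String, String, String>) AccountFormat::formatAccount,
        //       DataTypes.StringType);
        //   spark.sql("SELECT formatAccount('1234567', '###-####')").show();
    }
}
```

Until a function is registered under a name in the session, Spark SQL has no way to look it up, which is consistent with the "Undefined function" error in the question.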

by Rahul2025 • New Contributor III

02-02-2023 10:27:24 PM

4039 Views
4 replies
4 kudos

Make environment variables defined in init script available to Spark JVM job?

Hi, We're using Databricks Runtime version 11.3 LTS and executing a Spark Java Job using a Job Cluster. To automate the execution of this job, we need to define (source in from bash config files) some environment variables through an init script (clust...

Latest Reply

Anonymous
Not applicable

04-10-2023 3:06:35 AM

4 kudos

Hi @Rahul K, Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies

by Rahul2025 • New Contributor III

02-02-2023 10:37:28 PM

5742 Views
11 replies
1 kudos

Limitation on size of init script

Latest Reply

Anonymous
Not applicable

04-10-2023 1:39:37 AM

1 kudos

Hi @Rahul K, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your ...

10 More Replies

by gauthamchettiar • New Contributor II

12-13-2022 5:57:27 AM

1935 Views
0 replies
1 kudos

Spark always performing broadcasts irrespective of spark.sql.autoBroadcastJoinThreshold during streaming merge operation with DeltaTable.

I am trying to do a streaming merge between delta tables using this guide: https://docs.delta.io/latest/delta-update.html#upsert-from-streaming-queries-using-foreachbatch Our Code Sample (Java): Dataset<Row> sourceDf = sparkSession ...

by rammy • Contributor III

11-21-2022 10:41:03 PM

1880 Views
1 replies
5 kudos

Not able to parse .doc extension file using Scala in Databricks notebook?

I was able to parse .doc extension files using Java with the help of the POI libraries, but when converting the Java code to Scala, I expected it to work with the same Java libraries. Instead it is showing the below erro...

Latest Reply

UmaMahesh1
Honored Contributor III

12-03-2022 12:57:24 AM

5 kudos

Hi @Ramesh Bathini, In PySpark, we have a docx module. I found that to be working perfectly fine. Can you try using that? Documentation and stuff could be found online. Cheers...

by BkP • Contributor

10-28-2022 12:55:01 AM

2432 Views
2 replies
3 kudos

Scala Connectivity to Databricks Bronze Layer Raw Data from a Non-Databricks Spark environment

Hi All, We are developing a new Scala/Java program which needs to read & process the raw data stored in source ADLS (which is a Databricks Environment) in parallel as the volume of the source data is very high (in GBs & TBs). What kind of connection ...

Latest Reply

BkP
Contributor

10-31-2022 12:31:28 PM

3 kudos

Hello experts. Any advice on this question? Tagging some folks from whom I have received answers before. Please help on this requirement or tag someone who can help on this: @Kaniz Fatma, @Vartika Nain, @Bilal Aslam

1 More Replies

by witnessthee • New Contributor II

10-10-2022 9:10:44 AM

7022 Views
3 replies
2 kudos

Resolved! Error when using pyflink on databricks, An error occurred while trying to connect to the Java server

Hi, right now I am trying to run a pyflink script that can connect to a Kafka server. When I run that script, I get the error "An error occurred while trying to connect to the Java server 127.0.0.1:35529". Do I need to install an extra JDK for that? er...

Latest Reply

-werners-
Esteemed Contributor III

10-11-2022 6:12:32 AM

2 kudos

Did you get Flink running on the Databricks cluster? Because that seems to be the issue here.

2 More Replies

by data_serf • New Contributor

08-04-2022 2:17:40 PM

8695 Views
3 replies
1 kudos

Resolved! How to integrate java 11 code in Databricks

Hi all, We're trying to attach Java libraries which are compiled/packaged using Java 11. After doing some research it looks like even the most recent runtimes use Java 8 which can't run the Java 11 code ("wrong version 55.0, should be 52.0" errors). Is t...
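
As background on the numbers in that error: 55 and 52 are class-file major versions, and the Java release is the major version minus 44, so 52 is Java 8 and 55 is Java 11. The sketch below reads the major version from a class-file header (4-byte magic 0xCAFEBABE, 2-byte minor version, 2-byte major version). The class name ClassFileVersion and the hand-built header bytes are illustrative assumptions, not something from the thread.

```java
import java.nio.ByteBuffer;

final class ClassFileVersion {

    // Maps a class-file major version to a Java release (52 -> 8, 55 -> 11).
    static int javaRelease(int majorVersion) {
        return majorVersion - 44;
    }

    // Reads major_version from the first 8 bytes of a class file:
    // u4 magic, u2 minor_version, u2 major_version (big-endian).
    static int readMajorVersion(byte[] header) {
        ByteBuffer buffer = ByteBuffer.wrap(header); // big-endian by default
        if (buffer.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a class file");
        }
        buffer.getShort();                 // skip minor_version
        return buffer.getShort() & 0xFFFF; // major_version as unsigned
    }

    public static void main(String[] args) {
        // Hand-built header of a class compiled for Java 11 (major version 55).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 55};
        int major = readMajorVersion(header);
        System.out.println("major " + major + " = Java " + javaRelease(major));
    }
}
```

In practice the header bytes would come from the first eight bytes of a .class file inside the library JAR. Where rebuilding the library is an option, compiling with `javac --release 8` produces version 52 class files that a Java 8 runtime can load.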

Latest Reply

matthewrj
New Contributor II

09-15-2022 8:28:43 PM

1 kudos

I have tried setting JNAME=zulu11-ca-amd64 under Cluster > Advanced options > Spark > Environment variables but it doesn't seem to work. I still get errors indicating Java 8 is the JRE and in the Spark UI under "Environment" I still see: Java Home: /u...

2 More Replies

by isaac_gritz • Databricks Employee

08-23-2022 1:12:57 AM

1924 Views
1 replies
6 kudos

Versions of Spark, Python, Scala, R in each Databricks Runtime

What version of Spark, Python, Scala, R are included in each Databricks Runtime? What libraries are pre-installed? You can find this info at the Databricks runtime releases page (AWS | Azure | GCP). Let us know if you have any additional questions on t...

Latest Reply

maxdata
Databricks Employee

08-25-2022 2:10:59 PM

6 kudos

Wow! Thanks for the help @Isaac Gritz !

by sage5616 • Valued Contributor

07-12-2022 8:40:36 AM

9612 Views
5 replies
7 kudos

Resolved! SQL Error when querying any tables/views on a Databricks cluster via Dbeaver.

I am able to connect to the cluster, browse its hive catalog, see tables/views and columns/datatypes. Running a simple select statement from a view on a parquet file produces this error and no other results: "SQL Error [500540] [HY000]: [Databricks][Dat...

Latest Reply

sage5616
Valued Contributor

07-20-2022 7:37:37 AM

7 kudos

Update. I have tried SQL Workbench/J and encountered exactly the same error(s) as with Dbeaver. I have also tried JetBrains DataGrip and it worked flawlessly. Able to connect, browse the databases and query tables/views. https://docs.microsoft.com/en...

4 More Replies

by codevisionz • New Contributor

07-23-2022 4:58:27 AM

611 Views
0 replies
0 kudos

Our Python Code Examples cover basic concepts, control structures, functions, lists, classes, objects, inheritance, polymorphism, file operations, da...

Our Python Code Examples cover basic concepts, control structures, functions, lists, classes, objects, inheritance, polymorphism, file operations, data structures, sorting algorithms, mathematical functions, mathematical sequences, threads, exceptio...

by mani238 • New Contributor III

05-08-2022 5:50:10 AM

5086 Views
4 replies
4 kudos

Resolved! How do I trigger Azure Databricks jobs using Java? Can you please give sample code to connect to Azure Databricks using Java?
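
The accepted answer is not shown in this listing, so the following is only a hedged sketch of one common approach, not necessarily what resolved the thread: calling the Databricks Jobs REST API `run-now` endpoint with the JDK 11+ HttpClient and a personal access token. The class name, the DATABRICKS_HOST and DATABRICKS_TOKEN environment variable names, and the job ID argument are placeholders chosen for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class RunDatabricksJob {

    // Builds a POST to the Jobs API "run-now" endpoint for an existing job.
    static HttpRequest buildRunNowRequest(String workspaceUrl, String token, long jobId) {
        String body = "{\"job_id\": " + jobId + "}";
        return HttpRequest.newBuilder()
                .uri(URI.create(workspaceUrl + "/api/2.1/jobs/run-now"))
                .header("Authorization", "Bearer " + token)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder configuration: workspace URL (for example
        // https://adb-<id>.azuredatabricks.net) and a personal access token.
        String workspaceUrl = System.getenv("DATABRICKS_HOST");
        String token = System.getenv("DATABRICKS_TOKEN");
        if (workspaceUrl == null || token == null || args.length == 0) {
            System.out.println("Set DATABRICKS_HOST and DATABRICKS_TOKEN, then pass a job ID.");
            return;
        }
        HttpRequest request = buildRunNowRequest(workspaceUrl, token, Long.parseLong(args[0]));
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        // Print the status and raw JSON body returned by the workspace.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Keeping the request construction separate from the network call makes it possible to check the URL and headers without contacting a workspace.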

Latest Reply

mani238
New Contributor III

05-12-2022 5:59:32 PM

4 kudos

Hi @Kaniz Fatma, I got the solution based on @Hubert Dudek's answer. Thanks @Hubert Dudek. Another doubt: how do I automate the Azure Synapse concept? Please help me. Thanks

3 More Replies

by _r_vind1199 • New Contributor II

04-14-2022 11:25:39 AM

3988 Views
3 replies
3 kudos

Resolved! Pyspark installation issue

When I try to start a PySpark session in PyCharm, it throws this error: RuntimeError("Java gateway process exited before sending its port number"). Could anyone help me to solve this?

Latest Reply

_r_vind1199
New Contributor II

04-15-2022 6:47:18 AM

3 kudos

@Aashita Ramteke, PySpark version 3.2.1

2 More Replies