Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by TylerTamasaucka, New Contributor II
  • 28204 Views
  • 5 replies
  • 2 kudos

org.apache.spark.sql.AnalysisException: Undefined function: 'MAX'

I am trying to create a JAR for an Azure Databricks job, but some code that works when using the notebook interface does not work when calling the library through a job. The weird part is that the job will complete the first run successfully but on an...

Latest Reply
skaja
New Contributor II
  • 2 kudos

I am facing a similar issue when trying to use the from_utc_timestamp function. I am able to call the function from a Databricks notebook, but when I use the same function inside my Java JAR and run it as a job in Databricks, it gives the error below. Analys...
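A frequently reported cause of this class of error is the JAR creating its own SparkContext/SQLContext instead of attaching to the session Databricks already provides, so built-in SQL functions are not registered in the new session. Below is a minimal hedged sketch of the getOrCreate pattern; the table and column names are placeholders, and this is one plausible fix rather than a confirmed diagnosis of the thread above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.max

object Main {
  def main(args: Array[String]): Unit = {
    // Attach to the session Databricks created for the job; do not
    // construct a fresh SparkContext/SQLContext inside the JAR.
    val spark = SparkSession.builder().getOrCreate()

    val df = spark.read.table("my_schema.my_table") // placeholder table
    df.agg(max("some_column")).show()               // built-in MAX resolves here
  }
}
```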

4 More Replies
by Mohit_m, Valued Contributor II
  • 24193 Views
  • 3 replies
  • 4 kudos

Resolved! How to get the Job ID and Run ID and save into a database

We have a Databricks job running with a main class and a JAR file. Our JAR code base is in Scala. Now, when our job starts running, we need to log the Job ID and Run ID into a database for future reference. How can we achieve this?

Latest Reply
Bruno-Castro
New Contributor II
  • 4 kudos

That article is for members only. Can we also explain here how to do it (for those who are not Medium members)? Thanks!
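Since the linked article is paywalled, here is a hedged sketch of one common pattern: pass the IDs to the JAR's main class as task parameters using Databricks dynamic value references (assumed here to be {{job.id}} and {{job.run_id}}; verify the exact syntax for your workspace), then persist them over JDBC. The JDBC URL, table, and credentials are placeholders.

```scala
import java.sql.DriverManager

object LogJobRun {
  // Configure the JAR task parameters as ["{{job.id}}", "{{job.run_id}}"];
  // Databricks resolves these references at run time (assumed syntax).
  def main(args: Array[String]): Unit = {
    val Array(jobId, runId) = args.take(2)

    // Placeholder connection details -- replace with your database.
    val conn = DriverManager.getConnection(
      "jdbc:postgresql://db-host:5432/audit", "user", "password")
    try {
      val stmt = conn.prepareStatement(
        "INSERT INTO job_runs (job_id, run_id) VALUES (?, ?)")
      stmt.setString(1, jobId)
      stmt.setString(2, runId)
      stmt.executeUpdate()
    } finally conn.close()
  }
}
```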

2 More Replies
by jwilliam, Contributor
  • 2296 Views
  • 2 replies
  • 1 kudos

Resolved! [BUG] Databricks installs WHL as JAR in Python Wheel Task?

I'm using a Python Wheel task in a Databricks job with WHEEL dependencies. However, the cluster installed the dependencies as JAR instead of WHEEL. Is this expected behavior or a bug?

Latest Reply
AndréSalvati
New Contributor III
  • 1 kudos

There you can see a complete template project with a Python wheel task and Databricks Asset Bundles. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

1 More Reply
by seefoods, New Contributor III
  • 1397 Views
  • 1 reply
  • 0 kudos

Run a JAR file in Databricks

I have created a job which runs a JAR file, but I get this error: NoClassDefFoundError: com/google/cloud/hadoop/gcsio/GoogleCloudStorageFileSystemOptions$TimestampUpdatePredicate Caused by: ClassNotFoundException: com.google.cloud.hadoop.gcsio.GoogleC...

Latest Reply
aiNdata
New Contributor II
  • 0 kudos

Hey Aubert, it seems you are missing a dependent class in the JAR. Either package the dependent classes into the JAR or add them to the classpath.
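One way to act on this advice is to build a fat JAR so the GCS connector classes travel with the application. A hedged build.sbt sketch using sbt-assembly; the artifact versions are illustrative and should be matched to your cluster, and Spark stays "provided" because Databricks supplies it:

```scala
// build.sbt -- illustrative versions; match them to your cluster.
name := "my-gcs-job"
scalaVersion := "2.12.15"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "3.2.1" % "provided",
  // Packages com.google.cloud.hadoop.gcsio.* into the assembly JAR.
  "com.google.cloud.bigdataoss" % "gcs-connector" % "hadoop3-2.2.11"
)

// project/plugins.sbt: addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.1")
assembly / assemblyMergeStrategy := {
  case PathList("META-INF", _*) => MergeStrategy.discard
  case _                        => MergeStrategy.first
}
```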

by Chanu, New Contributor II
  • 1815 Views
  • 2 replies
  • 2 kudos

Databricks JAR task type functionality

Hi, I would like to understand Databricks JAR-based workflow tasks. Can I interpret JAR-based runs as something like a spark-submit on a cluster? In the logs, I was expecting to see the spark-submit --class com.xyz --num-executors 4 etc. And, the...

Latest Reply
Chanu
New Contributor II
  • 2 kudos

Hi, I did try Workflows > Jobs > Create Task > JAR task type, uploaded my JAR and class, created a job cluster, and tested this task. This JAR reads some tables as input, does some transformations, and writes some other tables as output. I would like t...
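For readers comparing this to spark-submit: the JAR task runs your main class on the driver against the cluster the job provisioned, so you generally won't find a literal spark-submit --class line in the logs (Databricks has a separate Spark Submit task type for that). A hedged sketch of the entry-point shape, with placeholder table names; the conf dump is one way to see the effective settings that spark-submit flags would otherwise have conveyed:

```scala
import org.apache.spark.sql.SparkSession

object MyJob {
  def main(args: Array[String]): Unit = {
    // Attach to the session/context Databricks created for the task.
    val spark = SparkSession.builder().getOrCreate()

    // Inspect the effective Spark settings (what --num-executors etc.
    // would have conveyed) from inside the task.
    println(spark.sparkContext.getConf.toDebugString)

    val input = spark.read.table("source_table") // placeholder
    input.groupBy("key").count()
      .write.mode("overwrite").saveAsTable("target_table") // placeholder
  }
}
```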

1 More Reply
by ncouture, Contributor
  • 5034 Views
  • 3 replies
  • 1 kudos

Resolved! How to install a JAR library via a global init script?

I have a JAR I want installed as a library on all clusters. I have tried both wget /databricks/jars/ some_repo and cp /dbfs/FileStore/jars/name_of_jar.jar /databricks/jars/ at cluster start-up, but the JAR is not installed as a library. I am aware th...

Latest Reply
ncouture
Contributor
  • 1 kudos

Found a solution: echo /databricks/databricks-hive /databricks/jars /databricks/glue | xargs -n 1 cp /dbfs/FileStore/jars/NAME_OF_THE_JAR.jar — I had to first add the JAR as a library through the GUI via Create -> Library, then upload the downloaded JAR. ...

2 More Replies
by Monika8991, New Contributor II
  • 2440 Views
  • 2 replies
  • 1 kudos

Getting Spark/Scala versioning issues while running Spark jobs through a JAR

We tried moving our Scala script from a standalone cluster to the Databricks platform. Our script is compatible with the following versions: Spark 2.4.8, Scala 2.11.12. The Databricks cluster has the following Spark/Scala versions: Spark 3.2.1, Scala 2.12. 1: we ...
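Spark 3.2.1 is built against Scala 2.12, so a JAR compiled for Spark 2.4.8/Scala 2.11 generally needs to be recompiled (and 2.11-only dependencies upgraded) before it will run on that cluster. A hedged build.sbt sketch of the target versions; the exact patch versions are illustrative:

```scala
// build.sbt -- recompile against the cluster's versions (illustrative values).
scalaVersion := "2.12.15" // Scala 2.11 artifacts are binary-incompatible with 2.12

libraryDependencies ++= Seq(
  // "provided": the Databricks runtime ships Spark, so don't bundle it.
  "org.apache.spark" %% "spark-core" % "3.2.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "3.2.1" % "provided"
)
```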

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Monika Samant, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

1 More Reply
by yannickmo, New Contributor III
  • 6523 Views
  • 7 replies
  • 14 kudos

Resolved! Adding JAR from Azure DevOps Artifacts feed to Databricks job

Hello, we have some Scala code which is compiled and published to an Azure DevOps Artifacts feed. The issue is that we're now trying to add this JAR to a Databricks job (through Terraform) to automate the creation. To do this I'm trying to authenticate using...

Latest Reply
alexott
Databricks Employee
  • 14 kudos

As of right now, Databricks can't use non-public Maven repositories, as resolution of the Maven coordinates happens in the control plane. That's different from the R & Python libraries. As a workaround you may try to install libraries via an init script or ...

6 More Replies
by oussamak, New Contributor II
  • 3168 Views
  • 1 reply
  • 2 kudos

How to install JAR libraries from ADLS? I'm getting an error

I mounted the ADLS to my Azure Databricks resource, and I keep getting this error when I try to install a JAR from a container: Library installation attempted on the driver node of cluster 0331-121709-buk0nvsq and failed. Please refer to the followi...

by lily1, New Contributor III
  • 4367 Views
  • 3 replies
  • 2 kudos

Resolved! NoSuchMethodError: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor

When I execute a function in the google-cloud-bigquery:2.7.0 JAR, it executes a function in the gax:2.12.2 JAR, and then this gax JAR executes a function in the Guava JAR. This Guava JAR is a Databricks default library which is located at /databrick...
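A common way out of this kind of clash with a runtime-provided Guava is to shade (relocate) the Guava packages inside your own fat JAR, so the google-cloud-bigquery stack resolves the copy you bundled rather than the one under /databricks/jars. A hedged sbt-assembly sketch; the relocation prefix is arbitrary:

```scala
// build.sbt (with sbt-assembly): relocate Guava so the fat JAR uses its own
// newer copy instead of the Guava on the Databricks driver classpath.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("com.google.common.**" -> "shaded.com.google.common.@1").inAll
)
```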

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey there @Lily Kim, hope you are doing well! Thank you for posting your question. We are happy that you were able to find the solution. Would you please mark the answer as best? We'd love to hear from you.

2 More Replies
by brickster_2018, Databricks Employee
  • 3333 Views
  • 1 reply
  • 1 kudos

Resolved! Classpath issues when running spark-submit

How do I identify the JAR used to load a particular class? I am sure I packed the classes correctly in my application JAR. However, it looks like the class is loaded from a different JAR. I want to understand the details so that I can ensure to use the r...

Latest Reply
brickster_2018
Databricks Employee
  • 1 kudos

Adding the configurations below at the cluster level can help print more logs to identify the JARs from which the classes are loaded:
spark.executor.extraJavaOptions=-verbose:class
spark.driver.extraJavaOptions=-verbose:class
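Complementing the verbose-class logs, the standard JVM trick below reports where a single class was loaded from; it can be run from a notebook or inside the JAR itself. The class name is a placeholder.

```scala
// Prints the JAR (or directory) a class was loaded from.
val cls = Class.forName("com.example.MyClass") // placeholder class name
val location = Option(cls.getProtectionDomain.getCodeSource)
  .map(_.getLocation.toString)
  .getOrElse("unknown (e.g. bootstrap class loader)")
println(s"${cls.getName} loaded from $location")
```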
