Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Mohit_m
by Valued Contributor II
  • 15190 Views
  • 3 replies
  • 4 kudos

Resolved! How to get the Job ID and Run ID and save into a database

We have a Databricks job running a main class from a JAR file. Our JAR code base is in Scala. Now, when our job starts running, we need to log the Job ID and Run ID into a database for future use. How can we achieve this?

Latest Reply
Bruno-Castro
New Contributor II
  • 4 kudos

That article is for members only. Can we also specify here how to do it, for those who are not Medium members? Thanks!

2 More Replies
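For readers without access to the linked article: one commonly used approach (a sketch, not necessarily the article's exact method) is to pass the IDs to the JAR task as parameters using Databricks parameter substitution, e.g. ["{{job_id}}", "{{run_id}}"], and read them from your Scala main. The table and column names below are hypothetical.

```scala
// Sketch: the Databricks JAR task is configured with the parameters
// ["{{job_id}}", "{{run_id}}"]; at run time Databricks substitutes the
// actual IDs, and they arrive in main(args).
object JobRunLogger {
  // Builds the audit INSERT; the table and column names are illustrative.
  def insertSql(jobId: String, runId: String): String =
    s"INSERT INTO job_audit (job_id, run_id) VALUES ('$jobId', '$runId')"

  def main(args: Array[String]): Unit = {
    val Array(jobId, runId) = args.take(2)
    // In the real job, execute this over a JDBC connection instead of printing.
    println(insertSql(jobId, runId))
  }
}
```

From there, writing the row with your preferred JDBC driver is straightforward, and the same substitution works for other task types as well.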
jwilliam
by Contributor
  • 1260 Views
  • 3 replies
  • 1 kudos

Resolved! [BUG] Databricks install WHL as JAR in Python Wheel Task?

I'm using a Python wheel task in a Databricks job with WHEEL dependencies. However, the cluster installed the dependencies as JAR instead of WHEEL. Is this expected behavior or a bug?

Latest Reply
AndréSalvati
New Contributor III
  • 1 kudos

There you can see a complete template project with a Python wheel task and Databricks Asset Bundles. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

2 More Replies
seefoods
by New Contributor III
  • 835 Views
  • 1 reply
  • 0 kudos

Run a JAR file on Databricks

I have created a job which runs a JAR file, but I get this error: NoClassDefFoundError: com/google/cloud/hadoop/gcsio/GoogleCloudStorageFileSystemOptions$TimestampUpdatePredicate. Caused by: ClassNotFoundException: com.google.cloud.hadoop.gcsio.GoogleC...

Latest Reply
aiNdata
New Contributor II
  • 0 kudos

Hey Aubert, it seems you are missing a dependent class in the JAR. Either package the dependent classes into the JAR or add them to the classpath.

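The "package the dependent classes" option usually means building a fat JAR. A minimal build.sbt sketch using the sbt-assembly plugin (the plugin and connector versions, and the project name, are assumptions; marking Spark as "provided" keeps the cluster's own Spark out of the JAR):

```scala
// build.sbt (sketch). Requires, in project/plugins.sbt:
//   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.5")
name := "my-gcs-job"
scalaVersion := "2.12.15"

libraryDependencies ++= Seq(
  // The cluster supplies Spark at run time, so keep it out of the fat JAR.
  "org.apache.spark" %% "spark-sql" % "3.2.1" % "provided",
  // The missing com.google.cloud.hadoop.gcsio classes live in the GCS connector.
  "com.google.cloud.bigdataoss" % "gcs-connector" % "hadoop3-2.2.11"
)
```

Running `sbt assembly` then produces a single JAR with the GCS connector classes included, which you can upload as the job's JAR.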
Chanu
by New Contributor II
  • 1143 Views
  • 2 replies
  • 2 kudos

Databricks JAR task type functionality

Hi, I would like to understand Databricks JAR-based workflow tasks. Can I interpret a JAR-based run as something like a spark-submit on a cluster? In the logs, I was expecting to see spark-submit --class com.xyz --num-executors 4, etc. And, the...

Latest Reply
Chanu
New Contributor II
  • 2 kudos

Hi, I did try this: via Workflows > Jobs > Create Task > JAR task type, I uploaded my JAR and class, created a job cluster, and tested the task. The JAR reads some tables as input, does some transformations, and writes some other tables as output. I would like t...

1 More Replies
ncouture
by Contributor
  • 3213 Views
  • 3 replies
  • 1 kudos

Resolved! How to install a JAR library via a global init script?

I have a JAR I want to be installed as a library on all clusters. I have tried both wget into /databricks/jars/ from a repo and cp /dbfs/FileStore/jars/name_of_jar.jar /databricks/jars/ at cluster start-up, but the JAR is not installed as a library. I am aware th...

Latest Reply
ncouture
Contributor
  • 1 kudos

Found a solution: echo /databricks/databricks-hive /databricks/jars /databricks/glue | xargs -n 1 cp /dbfs/FileStore/jars/NAME_OF_THE_JAR.jar. I had to first add the JAR as a library through the GUI via Create -> Library, then uploaded the downloaded JAR. ...

2 More Replies
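For anyone who wants the accepted fix as an actual global init script, the commands can be wrapped like this (the paths and JAR name come from the thread; whether every directory, e.g. /databricks/glue, exists on your runtime version is an assumption worth checking):

```shell
#!/bin/bash
# Global init script (sketch): copy a JAR staged in DBFS into the
# directories Databricks puts on the cluster classpath.
# The JAR must first be uploaded, e.g. via Create -> Library, so it
# lands under /dbfs/FileStore/jars/.
set -euo pipefail

JAR=/dbfs/FileStore/jars/NAME_OF_THE_JAR.jar

for dir in /databricks/databricks-hive /databricks/jars /databricks/glue; do
  if [ -d "$dir" ]; then
    cp "$JAR" "$dir/"
  fi
done
```

Note that JARs placed this way go on the classpath directly; they will not show up in the cluster's Libraries tab.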
TylerTamasaucka
by New Contributor
  • 24708 Views
  • 4 replies
  • 0 kudos

org.apache.spark.sql.AnalysisException: Undefined function: 'MAX'

I am trying to create a JAR for an Azure Databricks job, but some code that works when using the notebook interface does not work when calling the library through a job. The weird part is that the job will complete the first run successfully, but on an...

Latest Reply
skaja
New Contributor II
  • 0 kudos

I am facing a similar issue when trying to use the from_utc_timestamp function. I am able to call the function from a Databricks notebook, but when I use the same function inside my Java JAR and run it as a job in Databricks, it gives the error below. Analys...

3 More Replies
Monika8991
by New Contributor II
  • 1634 Views
  • 2 replies
  • 1 kudos

Getting Spark/Scala versioning issues while running Spark jobs through a JAR

We tried moving our Scala script from a standalone cluster to the Databricks platform. Our script is compatible with the following versions: Spark 2.4.8, Scala 2.11.12. The Databricks cluster has the following Spark/Scala versions: Spark 3.2.1, Scala 2.12. 1: we ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Monika Samant, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

1 More Replies
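Since Databricks Runtime pins both the Spark and Scala versions, the usual resolution here is to recompile the project against the cluster's versions rather than bundling your own Spark. A build.sbt sketch for a runtime shipping Spark 3.2.1 / Scala 2.12 (exact patch versions are assumptions):

```scala
// build.sbt (sketch): compile against the cluster's Spark/Scala
// instead of the old Spark 2.4.8 / Scala 2.11.12 pair.
scalaVersion := "2.12.14"

libraryDependencies ++= Seq(
  // "provided": the cluster supplies Spark at run time; don't ship your own copy.
  "org.apache.spark" %% "spark-core" % "3.2.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "3.2.1" % "provided"
)
```

The `%%` operator appends the Scala binary version (_2.12) to each artifact name, which is what prevents the Scala 2.11/2.12 mismatch.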
yannickmo
by New Contributor III
  • 4243 Views
  • 8 replies
  • 14 kudos

Resolved! Adding JAR from Azure DevOps Artifacts feed to Databricks job

Hello, we have some Scala code which is compiled and published to an Azure DevOps Artifacts feed. The issue is that we're now trying to add this JAR to a Databricks job (through Terraform) to automate the creation. To do this, I'm trying to authenticate using...

Latest Reply
alexott
Valued Contributor II
  • 14 kudos

As of right now, Databricks can't use non-public Maven repositories, as the resolving of Maven coordinates happens in the control plane. That's different from the R and Python libraries. As a workaround, you may try to install libraries via an init script or ...

7 More Replies
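The init-script workaround can look like the sketch below: download the JAR from the private feed with a personal access token injected as an environment variable, then drop it into /databricks/jars. The feed URL, artifact coordinates, and variable name are all hypothetical placeholders.

```shell
#!/bin/bash
# Cluster init script (sketch): fetch a private JAR at cluster startup.
# DEVOPS_PAT is assumed to be injected via the cluster's environment
# (e.g. backed by a Databricks secret); the URL below is a placeholder.
set -euo pipefail

FEED_URL="https://pkgs.dev.azure.com/ORG/_packaging/FEED/maven/v1/com/example/mylib/1.0.0/mylib-1.0.0.jar"

# Azure DevOps accepts any user name with a PAT as the password.
curl -sSf -u "user:${DEVOPS_PAT}" -o /databricks/jars/mylib-1.0.0.jar "$FEED_URL"
```

Because the download happens on the cluster nodes rather than in the control plane, the private-repository limitation described above does not apply.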
oussamak
by New Contributor II
  • 1903 Views
  • 2 replies
  • 3 kudos

How to install JAR libraries from ADLS? I'm having an error

I mounted the ADLS to my Azure Databricks resource, and I keep getting this error when I try to install a JAR from a container: Library installation attempted on the driver node of cluster 0331-121709-buk0nvsq and failed. Please refer to the followi...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Oussama KIASSI, the error message says: Failure to initialize configuration. Invalid configuration value detected for fs.azure.account.key. You can't use the storage account access key to access data using the abfss protocol. You need to provide ...

1 More Replies
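The usual way to satisfy that requirement is to configure OAuth with a service principal instead of the account key. A hedged sketch of the Spark configuration (the storage account name, application ID, tenant ID, and secret scope/key are placeholders; verify the exact keys against the docs for your runtime):

```scala
// Spark conf sketch for abfss access with a service principal.
val account = "<storage-account>.dfs.core.windows.net"

spark.conf.set(s"fs.azure.account.auth.type.$account", "OAuth")
spark.conf.set(s"fs.azure.account.oauth.provider.type.$account",
  "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(s"fs.azure.account.oauth2.client.id.$account", "<app-id>")
// Pull the client secret from a secret scope rather than hard-coding it.
spark.conf.set(s"fs.azure.account.oauth2.client.secret.$account",
  dbutils.secrets.get(scope = "my-scope", key = "sp-secret"))
spark.conf.set(s"fs.azure.account.oauth2.client.endpoint.$account",
  "https://login.microsoftonline.com/<tenant-id>/oauth2/token")
```

The same keys can be set in the cluster's Spark config instead of in code, which is what library installation from ADLS needs.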
lily1
by New Contributor III
  • 2671 Views
  • 3 replies
  • 2 kudos

Resolved! NoSuchMethodError: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor

When I execute a function in the google-cloud-bigquery:2.7.0 JAR, it executes a function in the gax:2.12.2 JAR, and this gax JAR then executes a function in the Guava JAR. This Guava JAR is a Databricks default library which is located at /databrick...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey there @Lily Kim, hope you are doing well! Thank you for posting your question. We are happy that you were able to find the solution. Would you please mark the answer as best? We'd love to hear from you.

2 More Replies
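A common way out of this kind of Guava conflict (a sketch of the standard technique, not necessarily what the thread's author did) is to shade Guava inside your fat JAR with sbt-assembly, so google-cloud-bigquery and gax see their own copy instead of the older Guava that Databricks ships:

```scala
// build.sbt fragment (sketch) for the sbt-assembly plugin:
// relocate com.google.common so it cannot clash with the cluster's Guava.
import sbtassembly.AssemblyPlugin.autoImport._

assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("com.google.common.**" -> "shaded.com.google.common.@1").inAll
)
```

After `sbt assembly`, the bytecode references inside your JAR point at shaded.com.google.common.*, so the NoSuchMethodError from the older default Guava can no longer occur.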
User16869510359
by Esteemed Contributor
  • 2353 Views
  • 1 reply
  • 1 kudos

Resolved! Classpath issues when running spark-submit

How can I identify the JARs used to load a particular class? I am sure I packed the classes correctly in my application JAR. However, it looks like the class is loaded from a different JAR. I want to understand the details so that I can ensure to use the r...

Latest Reply
User16869510359
Esteemed Contributor
  • 1 kudos

Adding the below configurations at the cluster level can help print more logs to identify the JARs from which classes are loaded:
spark.executor.extraJavaOptions=-verbose:class
spark.driver.extraJavaOptions=-verbose:class

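Alongside -verbose:class, you can also ask the JVM directly at run time where a specific class came from; a small self-contained sketch:

```scala
// Reports the code source (usually the JAR path) a class was loaded from.
// Classes from the bootstrap class loader (e.g. java.lang.String) have no
// code source, so the result is wrapped in Option.
object ClassOrigin {
  def jarOf(className: String): Option[String] =
    Option(Class.forName(className).getProtectionDomain.getCodeSource)
      .map(_.getLocation.toString)

  def main(args: Array[String]): Unit =
    args.foreach(c => println(s"$c -> ${jarOf(c).getOrElse("<bootstrap>")}"))
}
```

Calling jarOf("com.yourpackage.YourClass") from a notebook or from the job itself pinpoints which JAR actually won the classpath race, without restarting the cluster with extra JVM flags.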