Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

Deepak_Kandpal
by New Contributor III
  • 10279 Views
  • 3 replies
  • 3 kudos

Resolved! Invalid configuration value detected for fs.azure.account.key with com.crealytics:spark-excel

I have set up my Databricks notebook to use a Service Principal to access ADLS using the configuration below. service_credential = dbutils.secrets.get(scope="<scope>",key="<service-credential-key>") spark.conf.set("fs.azure.account.auth.type.<storage-accou...

Latest Reply
Harsha_Dbrs
New Contributor II
  • 3 kudos

Below is the implementation of the same code in Scala: spark.sparkContext.hadoopConfiguration.set("fs.azure.account.key.<accountName>.dfs.core.windows.net", <accountKey>)
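
For readers on PySpark, a rough equivalent of the Scala line above might look like the sketch below. It assumes the Hadoop configuration is reached through `spark._jsc`, which is a private PySpark attribute, so treat it as a workaround and not a stable API. The helper names `adls_account_key_conf` and `set_adls_account_key` are hypothetical, introduced only for illustration.

```python
def adls_account_key_conf(storage_account: str) -> str:
    """Build the Hadoop configuration key for ADLS Gen2 account-key access."""
    return f"fs.azure.account.key.{storage_account}.dfs.core.windows.net"


def set_adls_account_key(spark, storage_account: str, account_key: str) -> None:
    """Set the account key on the Hadoop configuration of an existing session.

    Assumption: `spark` is a live SparkSession. `spark._jsc` is PySpark's
    private handle to the JVM JavaSparkContext.
    """
    hadoop_conf = spark._jsc.hadoopConfiguration()
    hadoop_conf.set(adls_account_key_conf(storage_account), account_key)
```

In a notebook this could be called as `set_adls_account_key(spark, "<accountName>", dbutils.secrets.get(scope="<scope>", key="<account-key-secret>"))`, where the secret key name is a placeholder.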

  • 3 kudos
2 More Replies
dataslicer
by Contributor
  • 8619 Views
  • 4 replies
  • 1 kudos

Successfully installed Maven:Coordinates:com.crealytics:spark-excel_2.12:3.2.0_0.16.0 on Azure DBX 9.1 LTS runtime but getting error for missing dependency: org.apache.commons.io.IOUtils.byteArray(I)

I am using Azure DBX 9.1 LTS and successfully installed the following library on the cluster using Maven coordinates: com.crealytics:spark-excel_2.12:3.2.0_0.16.0 When I executed the following line: excelSDF = spark.read.format("excel").option("dataAdd...

Latest Reply
RamRaju
New Contributor II
  • 1 kudos

Hi @dataslicer, were you able to solve this issue? I am using Databricks 9.1 LTS with Spark 3.1.2 and Scala 2.12. I have installed com.crealytics:spark-excel-2.12.17-3.1.2_2.12:3.1.2_0.18.1. It was working fine, but I am now facing the same exception a...

  • 1 kudos
3 More Replies
brickster_2018
by Databricks Employee
  • 4125 Views
  • 2 replies
  • 3 kudos

Resolved! Can I install notebook scoped JAR/Maven libraries?

Notebook-scoped libraries are very handy. Is it possible to use the same mechanism for Maven JARs or application JARs as well?

Latest Reply
Pratik_Ghosh
New Contributor II
  • 3 kudos

Any further update on this topic?

  • 3 kudos
1 More Replies
eyalo
by New Contributor II
  • 4902 Views
  • 6 replies
  • 0 kudos

Why doesn't the SFTP ingest work?

Hi, I wrote the following code, but the cluster runs for a long time and then stops without any results. My code is attached below. (I used the 'com.springml.spark.sftp' library and installed it via Maven.) Also, I whitelisted my lo...

Latest Reply
eyalo
New Contributor II
  • 0 kudos

@Debayan Mukherjee Hi, I don't know if you got my reply, so I am bouncing my message to you again. Thanks.

  • 0 kudos
5 More Replies
ros
by New Contributor III
  • 2471 Views
  • 2 replies
  • 3 kudos

Apache Hudi Table creation using hudi maven library

I installed the Hudi Maven library org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.0 on Databricks Runtime 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12) with Spark config: spark.sql.catalog.spark_catalog org.apache.spark.sql.hudi.catalog.HoodieCat...

Latest Reply
ros
New Contributor III
  • 3 kudos

@Shanmugavel Chandrakasu​ %sql create table hudi_cow_pt_tbl ( id bigint, name string, ts bigint, dt string, hh string ) using hudi tblproperties ( type = 'cow', primaryKey = 'id', preCombineField = 'ts' ) partitioned by (dt, hh) location '/mnt/data/h...

  • 3 kudos
1 More Replies
Himanshu1
by New Contributor II
  • 2362 Views
  • 1 replies
  • 3 kudos

How to read XML files in Delta Live Tables?

Even after installing the Maven library using auto installation, spark.read.option("rowTag", "tag").xml("dbfs:/mnt/dev/bronze/xml/fileName.xml") is not working.

Latest Reply
DD_Sharma
New Contributor III
  • 3 kudos

At present, DLT does not support installing Maven libraries from the DLT pipeline. This feature should come in the future, so please keep checking the Databricks runtime release docs https://docs.databricks.com/release-note...

  • 3 kudos
nachog99
by New Contributor II
  • 20275 Views
  • 4 replies
  • 1 kudos

Databricks cluster starts with docker

Hi there! I hope you are doing well. I'm trying to start a cluster with a Docker image to install all the libraries that I have to use. I have the following Dockerfile to install only Python libraries, as you can see: FROM databricksruntime/standard WORKDIR /...

Latest Reply
xneg
Contributor
  • 1 kudos

Hi! I am facing a similar issue. I tried to use this one: FROM databricksruntime/standard:10.4-LTS ENV DEBIAN_FRONTEND=noninteractive RUN apt update && apt install -y maven && rm -rf /var/lib/apt/lists/* RUN /databricks/python3/bin/pip install datab...

  • 1 kudos
3 More Replies
ABVectr
by New Contributor III
  • 3502 Views
  • 6 replies
  • 1 kudos

Resolved! Maven Package install failing on DBR 11.3 LTS

Hi Databricks Community, I ran into the following issue when setting up a new cluster with the latest LTS Databricks runtime (11.3). When trying to install the package with the coordinates com.microsoft.azure.kusto:kusto-spark_3.0_2.12:3.1.4 from Mave...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Andrei Bondarenko​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you....

  • 1 kudos
5 More Replies
blackcoffeeAR
by Contributor
  • 3946 Views
  • 5 replies
  • 2 kudos

Cannot install com.microsoft.azure.kusto:kusto-spark

Hello, I'm trying to install/update the library com.microsoft.azure.kusto:kusto-spark_3.0_2.12:3.1.x. I tried installing it from the Maven Central repository and using Terraform. It was working previously, and now the installation always ends with an error: │ Error: c...

Latest Reply
phisolani
New Contributor II
  • 2 kudos

I have the same problem with a slightly different version of the connector (a change in the minor version). I have a job that runs every hour, and this started to happen from the 23rd of January onwards. The error does indeed say the same: Ru...

  • 2 kudos
4 More Replies
Gandham
by New Contributor II
  • 3451 Views
  • 3 replies
  • 2 kudos

Maven Libraries are failing on restarting the cluster.

I have installed the "com.databricks:spark-xml_2.12:0.16.0" Maven library on a cluster. The installation was successful. But when I restart the cluster, even this successful installation changes to failed. This happens with all Maven libraries. Here is th...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

It is an intermittent issue; we also faced it earlier. Try upgrading the DBR version.

  • 2 kudos
2 More Replies
LPlates
by New Contributor III
  • 10939 Views
  • 2 replies
  • 1 kudos

Resolved! How do you read an Excel spreadsheet with Databricks

My cluster has Scala 2.12. I've installed Maven Library com.crealytics:spark-excel_2.12:0.14.0. I get an error java.lang.IllegalStateException: Cannot get a STRING value from a NUMERIC cell when trying to execute the following: %python excelFileName="/mnt/dl...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Another approach that may help in your case is using pandas to read the Excel file and then converting the pandas DataFrame to a PySpark DataFrame.

  • 1 kudos
1 More Replies
Lars_J
by New Contributor
  • 1687 Views
  • 2 replies
  • 0 kudos

Databricks-jdbc and vulnerabilities CVE-2022-42004, CVE-2022-42003

The latest version of Databricks-jdbc available through Maven (2.6.29) now has these two vulnerabilities: https://nvd.nist.gov/vuln/detail/CVE-2022-42004 https://nvd.nist.gov/vuln/detail/CVE-2022-42003 All due to depending on and including in the jar th...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Lars Joreteg, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 0 kudos
1 More Replies
Bit-Warrior
by New Contributor
  • 626 Views
  • 0 replies
  • 0 kudos

Installing System ML on the cluster

I am trying to install the systemml package from Maven. I excluded the libraries log4j:log4j, com:sun.jdmk, com:sun.jmx, javax:jms. But when I run one command of systemml, then spark/databricks can no longer select from tables, effectively breaking somet...

antoniodavideca
by New Contributor III
  • 2402 Views
  • 2 replies
  • 0 kudos

Jobs REST Api - Create new Job with a new Cluster, and install a Maven Library on the Cluster

I need to use the Jobs REST API to create a job on our Databricks cluster. At job creation, it is possible to specify an existing cluster or create a new one. I can forward a lot of information to the cluster, but what I would like to specify is ...

Latest Reply
Prabakar
Databricks Employee
  • 0 kudos

@Antonio Davide Cali You can reference the existing cluster in your JSON to use it for the job. To update or push libraries to the job, you can use the JobsUpdate API. As you want to push libraries to the cluster, you can push them using the new setting an...
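
For the original question (a new job cluster plus a Maven library), a request body could be assembled roughly as in the sketch below. It assumes the single-task Jobs API 2.0 `jobs/create` format, where `libraries` sits alongside `new_cluster`; in the multi-task 2.1 format the libraries belong to each task instead. The function name, job name, notebook path, runtime version, node type, and Maven coordinate are all placeholders chosen for illustration.

```python
import json


def build_job_payload(name, notebook_path, maven_coordinates,
                      spark_version, node_type_id, num_workers=1):
    """Assemble a job-creation request body with Maven libraries attached.

    Assumption: single-task Jobs API 2.0 layout. Adjust for the 2.1
    multi-task layout, where `libraries` is set per task.
    """
    return {
        "name": name,
        "new_cluster": {
            "spark_version": spark_version,
            "node_type_id": node_type_id,
            "num_workers": num_workers,
        },
        # One entry per Maven artifact, given as group:artifact:version.
        "libraries": [{"maven": {"coordinates": c}} for c in maven_coordinates],
        "notebook_task": {"notebook_path": notebook_path},
    }


# Placeholder values only; substitute your own workspace settings.
example_payload = build_job_payload(
    name="example-job",
    notebook_path="/Users/someone@example.com/example-notebook",
    maven_coordinates=["com.databricks:spark-xml_2.12:0.16.0"],
    spark_version="11.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
)
example_body = json.dumps(example_payload, indent=2)
```

The resulting `example_body` string would then be sent as the POST body to the workspace's job-creation endpoint with a valid access token.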

  • 0 kudos
1 More Replies
Michael_Galli
by Contributor III
  • 3460 Views
  • 4 replies
  • 2 kudos

Resolved! Unittest in PySpark - how to read XML with Maven com.databricks.spark.xml ?

When writing unit tests with unittest / pytest in PySpark, reading mock data sources with built-in formats like csv, json (spark.read.format("json")) works just fine. But when reading XML files with spark.read.format("com.databricks.spark.xml") in the ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Please install spark-xml from Maven. Because it comes from Maven, you need to install it on the cluster you are using via the cluster settings (or alternatively via the API or CLI): https://mvnrepository.com/artifact/com.databricks/spark-xml

  • 2 kudos
3 More Replies