cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

qwerty1
by Contributor
  • 4834 Views
  • 7 replies
  • 17 kudos

Resolved! When will databricks runtime be released for Scala 2.13?

I see that spark fully supports Scala 2.13. I wonder why is there no databricks runtime with Scala 2.13 yet. Any plans on making this available? It would be super useful.

  • 4834 Views
  • 7 replies
  • 17 kudos
Latest Reply
guersam
New Contributor
  • 17 kudos

I agree with @777. As Scala 3 is getting mature and there are more real use cases with Scala 3 on Spark now, support for Scala 2.13 will be valuable to users including us.I think the recent upgrade of Databricks runtime from JDK 8 to 17 was one of a ...

  • 17 kudos
6 More Replies
swzzzsw
by New Contributor III
  • 4272 Views
  • 6 replies
  • 0 kudos

Resolved! SQLServerException: deadlock

I'm using databricks to connect to a SQL managed instance via JDBC. SQL operations I need to perform include DELETE, UPDATE, and simple read and write. Since spark syntax only handles simple read and write, I had to open SQL connection using Scala an...

image.png
  • 4272 Views
  • 6 replies
  • 0 kudos
Latest Reply
Panda
Valued Contributor
  • 0 kudos

@swzzzsw Since you are performing database operations, to reduce the chances of deadlocks, make sure to wrap your SQL operations inside transactions using commit and rollback.Another approachs to consider is adding retry logic or using Isolation Leve...

  • 0 kudos
5 More Replies
Mohit_m
by Valued Contributor II
  • 22981 Views
  • 3 replies
  • 4 kudos

Resolved! How to get the Job ID and Run ID and save into a database

We are having Databricks Job running with main class and JAR file in it. Our JAR file code base is in Scala. Now, when our job starts running, we need to log Job ID and Run ID into the database for future purpose. How can we achieve this?

  • 22981 Views
  • 3 replies
  • 4 kudos
Latest Reply
Bruno-Castro
New Contributor II
  • 4 kudos

That article is for members only, can we also specify here how to do it (for those that are not medium members?). Thanks!

  • 4 kudos
2 More Replies
YSDPrasad
by New Contributor III
  • 5555 Views
  • 3 replies
  • 3 kudos

Resolved! NoClassDefFoundError: scala/Product$class

import com.microsoft.azure.sqldb.spark.config.Configimport com.microsoft.azure.sqldb.spark.connect._import com.microsoft.azure.sqldb.spark.query._val query = "Truncate table tablename"val config = Config(Map( "url"     -> dbutils.secrets.get(scope = ...

  • 5555 Views
  • 3 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

@Someswara Durga Prasad Yaralgadda​ :The NoClassDefFoundError error occurs when a class that was available during the compile time is not available at the runtime. This could be due to a few reasons, including a missing dependency or an incompatible ...

  • 3 kudos
2 More Replies
schnee1
by New Contributor III
  • 8636 Views
  • 8 replies
  • 0 kudos

Access struct elements inside dataframe?

I have JSON data set that contains a price in a string like "USD 5.00". I'd like to convert the numeric portion to a Double to use in an MLLIB LabeledPoint, and have managed to split the price string into an array of string. The below creates a data...

  • 8636 Views
  • 8 replies
  • 0 kudos
Latest Reply
goldentriangle
New Contributor II
  • 0 kudos

Thanks, Golden Triangle Tour

  • 0 kudos
7 More Replies
AryaMa
by New Contributor III
  • 25730 Views
  • 13 replies
  • 8 kudos

Resolved! reading data from url using spark

reading data form url using spark ,community edition ,got a path related error ,any suggestions please ? url = "https://raw.githubusercontent.com/thomaspernet/data_csv_r/master/data/adult.csv" from pyspark import SparkFiles spark.sparkContext.addFil...

  • 25730 Views
  • 13 replies
  • 8 kudos
Latest Reply
padang
New Contributor II
  • 8 kudos

Sorry, bringing this back up...​from pyspark import SparkFiles url = "http://raw.githubusercontent.com/ltregan/ds-data/main/authors.csv" spark.sparkContext.addFile(url) df = spark.read.csv("file://"+SparkFiles.get("authors.csv"), header=True, inferSc...

  • 8 kudos
12 More Replies
Sandesh87
by New Contributor III
  • 934 Views
  • 1 replies
  • 2 kudos

apply a function across multiple smaller dataframes created from one big dataframe in scala

The dataframe 'big_df' looks like the below| id| index| timestamp||:---- |:------:| -----:|| abc| 1| 11:00:00|| abc| 1| 11:00:10|| abc| 1| 11:00:20|| abc| 1| 11:00:30|| abc| 1| 11:00:40|| abc| 1| 11:00:50|| abc| 2| 11:01:00|| abc| 2| 11:01:10|| abc| ...

  • 934 Views
  • 1 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sandesh Puligundla​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 2 kudos
swatish0395
by New Contributor III
  • 3117 Views
  • 3 replies
  • 4 kudos

Resolved! how to create a scala jar using db notebook and save it in a file path inside databricks

I have scala function as below, i am unable to understand how to write a scala jar with the same, please find below code i have used Enforcing Column-Level Encryption - Databrick %scala import com.macasaet.fernet.{Key, StringValidator, Token}import o...

  • 3117 Views
  • 3 replies
  • 4 kudos
Latest Reply
swatish0395
New Contributor III
  • 4 kudos

I had to finally create the jar using teh intellij and sbt iconfiguration on the same env. and then installed the jar in the cluster it worked

  • 4 kudos
2 More Replies
Pawan1
by New Contributor II
  • 1702 Views
  • 1 replies
  • 2 kudos

Your administrator has forbidden Scala UDFs from being run on this cluster. How to enable access to Scala UDF on Azure Databricks cluster ?

Hi All,When i try to run a scala UDF in Azuredatabricks 10.1 (includes Apache Spark 3.2.0, Scala 2.12) cluster i was able to run the udf. However when i tried to run the same notebook in 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12) cluster i ha...

  • 1702 Views
  • 1 replies
  • 2 kudos
Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi, Are you trying this with High concurrency clusters? Also, please tag @Debayan Mukherjee​ with your next response so that I will get notified.

  • 2 kudos
gud4eve
by New Contributor III
  • 2444 Views
  • 1 replies
  • 0 kudos

Resolved! Scala app getting NullPointerException while migrating from DBR 7.3 to 9.1 (and above)

We are migrating our Scala jobs from AWS EMR (6.2.1 and Spark version - 3.0.1) to Lakehouse and few of our jobs are failing due to NullPointerException. We tried in Databricks Runtime 7.3 LTS, it is working fine. Because it had same spark version 3.0...

  • 2444 Views
  • 1 replies
  • 0 kudos
Latest Reply
gud4eve
New Contributor III
  • 0 kudos

In one of my code statements, I updated scala Boolean to java.lang.Boolean and this is working fine now. May be in new newer Spark versions, null in scala Boolean isn't supported.

  • 0 kudos
Databrickguy
by New Contributor II
  • 1109 Views
  • 1 replies
  • 0 kudos

How to use Java MaskFormatter in sparksql?

I create a function based on Java MaskFormatter function in Databricks/Scala.But when I call it from sparksql, I received error messageError in SQL statement: AnalysisException: Undefined function: formatAccount. This function is neither a built-in/t...

  • 1109 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Tim zhang​ :The issue is that the formatAccount function is defined as a Scala function, but SparkSQL is looking for a SQL function. You need to register the Scala function as a SQL function so that it can be called from SparkSQL. You can register t...

  • 0 kudos
bchaubey
by Contributor II
  • 3634 Views
  • 1 replies
  • 0 kudos

unable to connect with Azure Storage with Scala

Hi Team, I am unable to connect Storage account with scala in Databricks, getting bellow error.AbfsRestOperationException: Status code: -1 error code: null error message: Cannot resolve hostname: ptazsg5gfcivcrstrlrs.dfs.core.windows.netCaused by: Un...

  • 3634 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Bhagwan Chaubey​ :The error message suggests that the hostname for your Azure Storage account could not be resolved. This could happen if there is a network issue, or if the hostname is incorrect.Here are some steps you can try to resolve the issue:...

  • 0 kudos
jerry-xu-sa
by New Contributor II
  • 2494 Views
  • 2 replies
  • 1 kudos

Order of a dataframe is not perserved after calling cache() and limit()

Here are the simple steps to reproduce it. Note that col "foo" and "bar" are just redundant cols to make sure the dataframe doesn't fit into a single partition. // generate a random df val rand = new scala.util.Random val df = (1 to 3000).map(i => (r...

  • 2494 Views
  • 2 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Jerry Xu​ Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.Please help us select the best solution by clicking on "Select As Best" if it does.Your feedback wil...

  • 1 kudos
1 More Replies
Labels