Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks Connect and DBR 16.4 LTS @ Scala 2.13

dollyb
Contributor II

Hi there,

We're running Scala jobs on Databricks, and I was eager to finally upgrade to Scala 2.13. However, Databricks Connect 16.4.x isn't published for Scala 2.13, so all of its dependencies are tied to Scala 2.12, and it's rather tedious to exclude all of those 2.12 dependencies. I'm also running into the issue that my project wants to use Scala 2.13.16 (the latest version), while DBR 16.4 runs Scala 2.13.10.
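
For illustration, the exclusion dance looks roughly like this in sbt (the exact coordinates and excluded modules below are examples, not a complete list):

// Pull the Scala 2.12 build of Databricks Connect into a 2.13 project...
libraryDependencies += ("com.databricks" % "databricks-connect_2.12" % "16.4.0")
  .excludeAll(
    ExclusionRule(organization = "org.scala-lang", name = "scala-library"),
    ExclusionRule(organization = "org.scala-lang.modules")
  )
// ...and then re-add a _2.13 build for every excluded module,
// one new build error at a time.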

Will there be a Scala 2.13-specific release of Databricks Connect? Or is there a different way of running my project?

 

2 ACCEPTED SOLUTIONS (shown in the replies below)

4 REPLIES

BigRoux
Databricks Employee

Here is some information to consider:

 

Databricks Connect does not currently offer full Scala 2.13 support. Scala 2.13 support is planned for the Databricks Connect release that accompanies DBR 17.0, expected in early June 2025. Until then, Databricks Connect's dependencies require careful alignment between the Scala version used in development and the one running on the cluster.
For running your project with Scala 2.13 on DBR 16.4 in the meantime, you can consider the following approaches:
  1. Use Direct Cluster Execution: Since DBR 16.4 supports both Scala 2.12 and 2.13, you can deploy jobs directly on the cluster against the Scala 2.13 runtime, bypassing Databricks Connect.
  2. Resolve Dependency Conflicts Manually: As suggested in the migration guides, update your build files to resolve conflicts between Scala 2.12 and 2.13 artifacts, for example by excluding incompatible libraries or shading them.
  3. Cross-Build Your Project: If Databricks Connect compatibility is essential right now, cross-compile your code with Scala 2.12 to stay compatible, while still exercising Scala 2.13 builds on DBR 16.4 (see the sketch after this list).
These measures can mitigate the dependency-management issues and reduce manual exclusions until Databricks Connect ships full Scala 2.13 support.
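
A minimal cross-building setup in sbt (option 3) might look like the following; the exact patch versions here are assumptions, so pin them to what your cluster actually reports:

// build.sbt -- compile and test against both Scala versions.
// Prefix any task with "+" (e.g. `sbt +test`) to run it for every version listed.
ThisBuild / scalaVersion       := "2.12.18"
ThisBuild / crossScalaVersions := Seq("2.12.18", "2.13.10")

sbt then builds each artifact with the matching _2.12 or _2.13 suffix, so the same codebase can serve Databricks Connect today and the Scala 2.13 runtime as support lands.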
 
Cheers, Lou.

Hi Lou,

Thanks for your detailed answer!

So far I'm trying option 2, but swapping out so many dependencies seems like a strange thing to do and feels like a game of whack-a-mole as new build errors pop up. If the dependency situation gets resolved with DBR 17, it might be acceptable.

I guess I'll try option 3 as well and cross-build my project.

What do you mean by option 1? Not using Databricks Connect at all? In that case, do I just use the normal Spark (3.5.2) dependencies to run my job on the cluster? And do I use Spark Connect's SparkSession or the plain old SparkContext?

 

BigRoux
Databricks Employee

Runtime 17.0 is out in beta right now, and I expect it to GA in the near future. Keep an eye out for the runtime release notes; once they are published you will be able to see what's included, and hopefully (fingers crossed) your dependency issue will be resolved: https://docs.databricks.com/aws/en/release-notes/runtime

I'm not sure we can migrate to 17 before it reaches LTS status, and there's also Spark 4.0 to migrate to, so I'd like to just use Scala 2.13 on 16.4 for now. It seems that getting rid of Databricks Connect solves my dependency issues, as well as the issues with Databricks' overrides of the Spark and Delta APIs.
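
In case it helps anyone else, the direct-on-cluster setup I'm moving towards looks roughly like this (the versions are my assumptions for DBR 16.4, so double-check them against your cluster):

// build.sbt -- plain Spark, marked Provided because the cluster supplies it at runtime.
scalaVersion := "2.13.10"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.2" % Provided

// Main.scala -- entry point for a JAR job that runs directly on the cluster.
import org.apache.spark.sql.SparkSession

object Main {
  def main(args: Array[String]): Unit = {
    // getOrCreate() picks up the SparkSession the Databricks cluster already provides.
    val spark = SparkSession.builder().getOrCreate()
    spark.range(10).show()
  }
}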