Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Trigger.AvailableNow on scala - compile issue

emanuele_maffeo
New Contributor III

Hi everybody,

Trigger.AvailableNow was released with the Databricks 10.1 runtime, and we would like to use this new feature with Auto Loader.

We write all our data pipelines in Scala, and our projects import Spark as a provided dependency. If we switch to Spark 3.2.0 (which Databricks 10.1 is based on), our code does not compile, since Trigger.AvailableNow is not in that release (at least in the open-source version of Spark). Looking at the GitHub repository, it seems this functionality will be released with Spark 3.3.

Do we have to wait until the Spark 3.3 release?

1 ACCEPTED SOLUTION


That's fair.

Anyway, this feature is essentially backported from Spark 3.3.0, but since Spark 3.3.0 has not been released yet, I cannot use it: my code won't compile, so my whole development process breaks.

In the meantime I've found an ugly hack (using reflection) that lets me work around the issue:

import org.apache.spark.sql.streaming.Trigger

// Look up the AvailableNow factory method reflectively: the Trigger class
// exists in open-source Spark 3.2.0, but this method does not yet
val clazz   = Class.forName("org.apache.spark.sql.streaming.Trigger")
val method  = clazz.getMethod("AvailableNow")
val trigger = method.invoke(null).asInstanceOf[Trigger]

val streamWriter = df.writeStream
  .format("delta")
  .options(config.sparkWriteOptions)
  .trigger(trigger)

Still, I think this is something that needs to be addressed somehow; in the future there may be other backported features where this workaround won't work.
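A slightly safer variant of the same reflection trick wraps the lookup in Try, so a missing class or method degrades to None instead of throwing at runtime. This is only a sketch, and resolveStatic is a hypothetical helper name; it is demonstrated here on a JDK class so it runs without Spark on the classpath:

```scala
import scala.util.Try

// Hypothetical helper: resolve a static no-arg method reflectively,
// returning None if the class or method is missing at runtime
// (e.g. when compiling against an older Spark release).
def resolveStatic[T](className: String, methodName: String): Option[T] =
  Try {
    val clazz  = Class.forName(className)
    val method = clazz.getMethod(methodName)
    method.invoke(null).asInstanceOf[T]
  }.toOption

// Demonstrated on java.time.Instant.now(), a static no-arg method:
val nowOpt = resolveStatic[java.time.Instant]("java.time.Instant", "now")
```

With Spark on the classpath, the same helper could resolve "org.apache.spark.sql.streaming.Trigger" / "AvailableNow" and fall back to another trigger when it returns None.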


5 REPLIES

Anonymous
Not applicable

You could switch to Python. Depending on what you're doing, and whether you're using UDFs, there shouldn't be any difference at all in terms of performance.

Anonymous
Not applicable

Also, it does look like it's available in Scala in 10.1, per the release notes:

https://docs.databricks.com/release-notes/runtime/10.1.html#triggeravailablenow-for-delta-source-str...

Yes, it's available in Scala if I use a Scala notebook. But what if I develop my code in an IDE and deploy it to Databricks using CD pipelines? Is there any chance of having the Databricks Runtime packaged as a JAR so that I can use it as an sbt dependency?
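For context, the setup described above (Spark as a provided dependency) typically looks something like this in build.sbt; the version shown is the open-source release closest to Databricks Runtime 10.1, which is exactly why the compile gap appears:

```scala
// build.sbt (sketch): compile against open-source Spark, but do not
// bundle it, since the Databricks runtime supplies Spark at run time.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "3.2.0" % Provided
)
```

Because the published 3.2.0 artifacts lack Trigger.AvailableNow, any direct reference to it fails at compile time even though it works on the cluster.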

Anonymous
Not applicable

Many things don't work in an IDE, such as dbutils and some Delta Lake features.

We don't release the source code as jars because if we did that AWS would package it and sell it.

