Trigger.AvailableNow on scala - compile issue

emanuele_maffeo
New Contributor III

Hi everybody,

Trigger.AvailableNow was released with the Databricks 10.1 runtime, and we would like to use this new feature with Auto Loader.

We write all our data pipelines in Scala, and our projects import Spark as a provided dependency. If we try to switch to Spark 3.2.0 (which Databricks 10.1 is based on), our code no longer compiles, since Trigger.AvailableNow is not in that release (at least not in the open source version of Spark). Looking at the GitHub repository, it seems this functionality will be released with Spark 3.3.

Do we have to wait until the Spark 3.3 release?
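For context, this is roughly what our build setup looks like: Spark declared as a provided dependency, pinned to the OSS version matching the runtime. A minimal sbt sketch (coordinates and versions illustrative, not our actual build file):

```scala
// build.sbt (illustrative): Spark is provided by the Databricks runtime at
// deploy time, so we compile against the matching open source artifacts.
ThisBuild / scalaVersion := "2.12.15"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "3.2.0" % Provided
)
```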


6 REPLIES

Anonymous
Not applicable

You can switch to Python. Depending on what you're doing, and as long as you're not using UDFs, there shouldn't be any difference at all in terms of performance.

Anonymous
Not applicable

Also, it does look like it's available in Scala in 10.1, according to the release notes:

https://docs.databricks.com/release-notes/runtime/10.1.html#triggeravailablenow-for-delta-source-str...

Yes, it's available in Scala if I use a Scala notebook. But what if I develop my code in an IDE and deploy it to Databricks using CD pipelines? Is there any chance of having the Databricks runtime packaged as a jar, so that I can use it as an sbt dependency?

Anonymous
Not applicable

Many things don't work in an IDE, such as dbutils and some Delta Lake features.

We don't release the runtime source code as jars because, if we did, AWS would package it and sell it.

That's fair.

Anyway, this feature is basically backported from Spark 3.3.0, but since Spark 3.3.0 has not been released yet, I cannot use it: my code won't compile, and hence my whole development process breaks.

In the meantime I've found an ugly hack (using reflection) that allows me to avoid this issue:


import org.apache.spark.sql.streaming.Trigger

// Look up Trigger.AvailableNow reflectively: the method exists at runtime on
// a Databricks 10.1 cluster, even though it is missing from the OSS Spark
// 3.2.0 jars we compile against.
val clazz   = Class.forName("org.apache.spark.sql.streaming.Trigger")
val method  = clazz.getMethod("AvailableNow")
val trigger = method.invoke(null).asInstanceOf[Trigger]

val streamWriter = df.writeStream
  .format("delta")
  .options(config.sparkWriteOptions)
  .trigger(trigger)

Anyway, I guess this is something that needs to be addressed somehow; in the future there may be other backported features where this workaround won't work.
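Since this may recur with other backported APIs, one way to make the same reflection trick a bit less fragile is to wrap it in a `Try` and fall back to a trigger that does exist in Spark 3.2.0. A minimal sketch (the `invokeStatic` helper and the fallback choice are my own illustration, not part of the runtime):

```scala
import scala.util.Try

// Generic helper: resolve a zero-argument static method by name at runtime.
// This lets code compiled against OSS Spark 3.2.0 pick up a method such as
// Trigger.AvailableNow on a cluster where it does exist.
def invokeStatic[T](className: String, methodName: String): Try[T] = Try {
  val clazz  = Class.forName(className)
  val method = clazz.getMethod(methodName)
  method.invoke(null).asInstanceOf[T]
}

// Hypothetical usage on a cluster (same names as the hack above); falls back
// to a trigger present in Spark 3.2.0 when AvailableNow cannot be resolved:
//   val trigger = invokeStatic[Trigger]("org.apache.spark.sql.streaming.Trigger", "AvailableNow")
//     .getOrElse(Trigger.Once())
```

The `Try` keeps the lookup failure recoverable, so the same jar runs on both older and newer runtimes.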

Kaniz
Community Manager

Hi @Emanuele Maffeo, thank you for sharing the "HACK".
