Hi @Tyler Tamasauckas ,
I was also facing the same issue with the SQL functions 'upper' and 'hash'.
In the jar we have to call the SparkSession.builder().getOrCreate() or SparkContext.getOrCreate() API to get the SparkSession/SparkContext instance.
If the jar uses the object-with-main() approach, it works fine the first time, but on later runs it somehow loses the instance. I don't know the exact reason for that.
The workaround is to use the "object ... extends App" approach in the jar; then it works.
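A minimal sketch of that App-trait approach, assuming Spark is on the classpath (the object name SomeJob and the query are placeholders, not names from the thread):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical job object: "extends App" makes the object body the program entry point.
object SomeJob extends App {
  // The SparkSession created here stays in scope for every statement below,
  // so there is no separate main() whose local instance could be lost.
  val spark = SparkSession.builder().getOrCreate()

  spark.sql("SELECT upper('hello')").show()

  spark.stop()
}
```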
The App trait approach takes about 10 seconds longer than an object with a main method, but only on the first run, and only for the first activity. This is because the App trait uses Scala's delayed-initialization feature, which applies to all Scala applications.
If we still need the main-method approach, define the Spark instance as an implicit value and take it as an implicit parameter wherever the instance is used.
e.g.
import org.apache.spark.sql.SparkSession

object SomeName {
  // Receives the SparkSession implicitly from the caller's scope.
  def UserDefinedMethod(query: String)(implicit spark: SparkSession) = spark.sql(query)

  def main(args: Array[String]): Unit = {
    implicit val spark = SparkSession.builder().getOrCreate()
    spark…
  }
}
Note: objects extending App receive command-line arguments (through the args member) from Scala 2.9 onward.