I copied this question from a very old post that I had responded to, and decided to move it here:
Context:
- I have a Scala jar that uses PureConfig (a wrapper around Typesafe Config).
- I uploaded an application.conf file to a path that is mounted to the workspace.
- I've already tested the jar logic via a notebook (it works).
- I'm now moving to a non-notebook approach (in this case, Airflow submits the API call, using either spark_submit_task or spark_jar_task). Both fail; see details below.
I've tried pointing the option below at either a /dbfs/mnt/blah path or a dbfs:/mnt/blah path,
in both spark_submit_task and spark_jar_task (via the cluster spark_conf for Java options), with no success:
spark.driver.extraJavaOptions
NOTE: Testing via a notebook using extraJavaOptions had no problems. However, we did notice that, in the notebook,
the commands below would not succeed unless we first ran ls on each parent folder, one by one:
ls /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf
cat /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf
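To narrow down where the lookup fails, a small check like the following could be run in a notebook cell. It is only a sketch: `first_missing_ancestor` is a hypothetical helper name, and it assumes the /dbfs mount is visible to the Python process on the driver.

```python
from pathlib import Path
from typing import Optional


def first_missing_ancestor(path: str) -> Optional[Path]:
    """Return the shallowest component of `path` that does not exist, or None if all exist."""
    target = Path(path)
    # `parents` is ordered deepest-first, so reverse it to walk from the root down.
    for candidate in [*reversed(target.parents), target]:
        if not candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    conf = "/dbfs/mnt/glue-artifacts/conf-staging-env/application.conf"
    missing = first_missing_ancestor(conf)
    print("all components exist" if missing is None else f"first missing component: {missing}")
```

Walking from the root down mirrors the manual "ls each parent folder" workaround, so the first reported component shows the level at which the mount stops being visible.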
Below is the snippet that Airflow uses:
spark_submit_task = {
    "parameters": [
        "--class", "com.source2sea.glue.GlueMain",
        "--conf", f"spark.driver.extraJavaOptions={java_option_d_config_file}",
        "--files", conf_path,
        jar_full_path, MY-PARAMETERS  # MY-PARAMETERS is a placeholder for the job's own arguments
    ]
}
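For context, the variables in that snippet are built roughly as sketched here. The concrete values are illustrative assumptions: `-Dconfig.file` is the Typesafe Config system property for pointing at an external file, the jar path is a made-up placeholder, and `build_spark_submit_task` is a hypothetical helper, not part of Airflow or the Databricks API. It also assumes the same path is passed to both `-Dconfig.file` and `--files`.

```python
from typing import Dict, List


def build_spark_submit_task(
    conf_path: str, jar_full_path: str, app_args: List[str]
) -> Dict[str, List[str]]:
    """Assemble the spark_submit_task payload for one candidate config path."""
    # Typesafe Config reads the config.file system property when it loads.
    java_option_d_config_file = f"-Dconfig.file={conf_path}"
    return {
        "parameters": [
            "--class", "com.source2sea.glue.GlueMain",
            "--conf", f"spark.driver.extraJavaOptions={java_option_d_config_file}",
            "--files", conf_path,
            jar_full_path,
            *app_args,
        ]
    }


if __name__ == "__main__":
    # The two path styles that were tried; the jar location is a placeholder.
    for conf_path in (
        "dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf",
        "/dbfs/mnt/glue-artifacts/conf-staging-env/application.conf",
    ):
        task = build_spark_submit_task(conf_path, "dbfs:/path/to/app.jar", [])
        print(task["parameters"])
```

Printing the parameter list for each path style makes it easy to compare exactly what the driver JVM receives against the path reported in the FileNotFoundException.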
In my Scala code I have something like this (it uses PureConfig, which is a wrapper around Typesafe Config; I made sure the step described here is done: https://pureconfig.github.io/docs/faq.html#how-can-i-use-pureconfig-with-spark-210-problematic-shape...):
val source = defaultOverrides.withFallback(defaultApplication).withFallback(defaultReference)

def read(source: ConfigObjectSource): Either[Throwable, AppConfig] = {
  implicit def hint[A] = ProductHint[A](ConfigFieldMapping(CamelCase, CamelCase))
  logger.debug(s"Loading configuration ${source.config()}")
  val original: Either[ConfigReaderFailures, AppConfig] = source.load[AppConfig]
  logger.info(s"Loaded and casted configuration ${original}")
  original.leftMap[Throwable](ConfigReaderException.apply)
}
Error log (with the dbfs:/ path):
23/04/25 13:45:49 INFO AppConfig$: Loaded and casted configuration Left(ConfigReaderFailures(ThrowableFailure(shaded.com.typesafe.config.ConfigException$IO: dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf: java.io.FileNotFoundException: dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf (No such file or directory),Some(ConfigOrigin(dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf)))))
23/04/25 13:45:49 ERROR GlueMain$: Glue failure
pureconfig.error.ConfigReaderException: Cannot convert configuration to a scala.runtime.Nothing$. Failures are:
- (dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf) dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf: java.io.FileNotFoundException: dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf (No such file or directory).
or (with the /dbfs/ path):
23/04/25 12:46:10 INFO AppConfig$: Loaded and casted configuration Left(ConfigReaderFailures(ThrowableFailure(shaded.com.typesafe.config.ConfigException$IO: /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf: java.io.FileNotFoundException: /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf (No such file or directory),Some(ConfigOrigin(/dbfs/mnt/glue-artifacts/conf-staging-env/application.conf)))))
23/04/25 12:46:10 ERROR GlueMain$: Glue failure
pureconfig.error.ConfigReaderException: Cannot convert configuration to a scala.runtime.Nothing$. Failures are:
- (/dbfs/mnt/glue-artifacts/conf-staging-env/application.conf) /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf: java.io.FileNotFoundException: /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf (No such file or directory).
at com.source2sea.glue.config.AppConfig$.$anonfun$read$2(AppConfig.scala:31)
Could anyone help me understand how to get this working?