<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic pass application.conf file into databricks jobs in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/pass-application-conf-file-into-databricks-jobs/m-p/5294#M1747</link>
    <description>Question: how to pass an application.conf file (loaded with PureConfig / Typesafe Config) into Databricks jobs submitted from Airflow via spark_submit_task or spark_jar_task. Full text in the first item below.</description>
    <pubDate>Tue, 25 Apr 2023 14:38:08 GMT</pubDate>
    <dc:creator>source2sea</dc:creator>
    <dc:date>2023-04-25T14:38:08Z</dc:date>
    <item>
      <title>pass application.conf file into databricks jobs</title>
      <link>https://community.databricks.com/t5/data-engineering/pass-application-conf-file-into-databricks-jobs/m-p/5294#M1747</link>
      <description>&lt;P&gt;I copied my question from a very old post that I had responded to, and decided to move it here.&lt;/P&gt;&lt;P&gt;Context:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;I have a Scala JAR that uses PureConfig (a wrapper around Typesafe Config).&lt;/LI&gt;&lt;LI&gt;I uploaded an application.conf file to a path that is mounted to the workspace.&lt;/LI&gt;&lt;LI&gt;I have already tested the JAR logic via a notebook, and it works.&lt;/LI&gt;&lt;LI&gt;Moving to a non-notebook approach (in this case Airflow submits the API call, using either spark_submit_task or spark_jar_task), both fail; see details below.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I have tried setting &lt;B&gt;spark.driver.extraJavaOptions&lt;/B&gt; to point at either the /dbfs/mnt/blah path or the dbfs:/mnt/blah path, in both &lt;B&gt;spark_submit_task&lt;/B&gt; and &lt;B&gt;spark_jar_task&lt;/B&gt; (via the cluster spark_conf for Java options), with no success.&lt;/P&gt;&lt;P&gt;NOTE: testing via a notebook using extraJavaOptions had no problems. (But we did notice that, in the notebook, the commands below would not succeed unless we first ran ls on the parent folders one by one.)&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;ls /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf
cat /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;See below the snippet that Airflow uses:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark_submit_task = {
    "parameters": [
        "--class", "com.source2sea.glue.GlueMain",
        "--conf", f"spark.driver.extraJavaOptions={java_option_d_config_file}",
        "--files", conf_path,
        jar_full_path, MY-PARAMETERS
    ]
}&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;In my Scala code I have something like this (using PureConfig, which is a wrapper of Typesafe Config; I made sure this is done: &lt;A href="https://pureconfig.github.io/docs/faq.html#how-can-i-use-pureconfig-with-spark-210-problematic-shapeless-dependency" alt="https://pureconfig.github.io/docs/faq.html#how-can-i-use-pureconfig-with-spark-210-problematic-shapeless-dependency" target="_blank"&gt;https://pureconfig.github.io/docs/faq.html#how-can-i-use-pureconfig-with-spark-210-problematic-shapeless-dependency&lt;/A&gt;):&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;val source = defaultOverrides.withFallback(defaultApplication).withFallback(defaultReference)

def read(source: ConfigObjectSource): Either[Throwable, AppConfig] = {
  implicit def hint[A] = ProductHint[A](ConfigFieldMapping(CamelCase, CamelCase))

  logger.debug(s"Loading configuration ${source.config()}")

  val original: Either[ConfigReaderFailures, AppConfig] = source.load[AppConfig]
  logger.info(s"Loaded and casted configuration ${original}")

  original.leftMap[Throwable](ConfigReaderException.apply)
}&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Error log:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;23/04/25 13:45:49 INFO AppConfig$: Loaded and casted configuration Left(ConfigReaderFailures(ThrowableFailure(shaded.com.typesafe.config.ConfigException$IO: dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf: java.io.FileNotFoundException: dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf (No such file or directory),Some(ConfigOrigin(dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf)))))
23/04/25 13:45:49 ERROR GlueMain$: Glue failure
pureconfig.error.ConfigReaderException: Cannot convert configuration to a scala.runtime.Nothing$. Failures are:
  - (dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf) dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf: java.io.FileNotFoundException: dbfs:/mnt/glue-artifacts/conf-staging-env/application.conf (No such file or directory).

or

23/04/25 12:46:10 INFO AppConfig$: Loaded and casted configuration Left(ConfigReaderFailures(ThrowableFailure(shaded.com.typesafe.config.ConfigException$IO: /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf: java.io.FileNotFoundException: /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf (No such file or directory),Some(ConfigOrigin(/dbfs/mnt/glue-artifacts/conf-staging-env/application.conf)))))
23/04/25 12:46:10 ERROR GlueMain$: Glue failure
pureconfig.error.ConfigReaderException: Cannot convert configuration to a scala.runtime.Nothing$. Failures are:
  - (/dbfs/mnt/glue-artifacts/conf-staging-env/application.conf) /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf: java.io.FileNotFoundException: /dbfs/mnt/glue-artifacts/conf-staging-env/application.conf (No such file or directory).

	at com.source2sea.glue.config.AppConfig$.$anonfun$read$2(AppConfig.scala:31)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Please help me figure out how to get this working.&lt;/P&gt;</description>
      <pubDate>Tue, 25 Apr 2023 14:38:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pass-application-conf-file-into-databricks-jobs/m-p/5294#M1747</guid>
      <dc:creator>source2sea</dc:creator>
      <dc:date>2023-04-25T14:38:08Z</dc:date>
    </item>
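The FileNotFoundException in the question above is consistent with how Typesafe Config resolves `-Dconfig.file`: it opens the path with plain `java.io`, so a `dbfs:/...` URI can never be found; only a driver-local FUSE path under `/dbfs/...` has a chance of working once the mount is visible on the driver. A minimal sketch of a task payload along those lines (the paths, class name, and variable names are hypothetical, not values confirmed in this thread):

```python
# Hypothetical Databricks Jobs API spark_submit_task payload. The key point:
# -Dconfig.file must be a driver-local FUSE path (/dbfs/...), because the
# Typesafe Config loader opens it with java.io, not via Hadoop/DBFS URIs.
conf_path = "/dbfs/mnt/my-artifacts/conf/application.conf"  # FUSE path, not dbfs:/
jar_full_path = "dbfs:/mnt/my-artifacts/jars/my-job.jar"

java_option_d_config_file = f"-Dconfig.file={conf_path}"

spark_submit_task = {
    "parameters": [
        "--class", "com.example.glue.GlueMain",
        # The whole -D flag must remain one token after the '=':
        "--conf", f"spark.driver.extraJavaOptions={java_option_d_config_file}",
        jar_full_path,
    ]
}
```

Nothing here is verified against the thread's environment; the sketch only illustrates the local-path requirement.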
    <item>
      <title>Re: pass application.conf file into databricks jobs</title>
      <link>https://community.databricks.com/t5/data-engineering/pass-application-conf-file-into-databricks-jobs/m-p/5295#M1748</link>
      <description>&lt;P&gt;I haven't tried it with spark-submit, but in my notebooks I use the FileStore for this (this is Typesafe Config):&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;val fileConfig = ConfigFactory.parseFile(
  new File("/dbfs/FileStore/NotebookConfig/app.conf"))&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;You could also add the conf file as an internal resource and pack it into the JAR. Of course that is only interesting if the conf file does not change often; otherwise you would need to build a new JAR for every change.&lt;/P&gt;</description>
      <pubDate>Tue, 25 Apr 2023 14:54:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pass-application-conf-file-into-databricks-jobs/m-p/5295#M1748</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2023-04-25T14:54:03Z</dc:date>
    </item>
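The reply's `ConfigFactory.parseFile(new File("/dbfs/FileStore/..."))` pattern sidesteps JVM `-D` flags entirely: the application opens the driver-local FUSE path itself. The same idea can be sketched in Python with a toy `key = value` parser standing in for HOCON (the paths and the parser are illustrative only, not Typesafe Config's actual behavior; a temp directory simulates the mount):

```python
import os
import tempfile

def load_conf(path: str) -> dict:
    """Parse simple 'key = value' lines -- a toy stand-in for HOCON parsing."""
    conf = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                conf[key.strip()] = value.strip().strip('"')
    return conf

# Simulate the mounted path; on Databricks this would be a FUSE path such as
# /dbfs/FileStore/NotebookConfig/app.conf.
tmpdir = tempfile.mkdtemp()
conf_file = os.path.join(tmpdir, "app.conf")
with open(conf_file, "w") as fh:
    fh.write('env = "staging"\nretries = 3\n')

conf = load_conf(conf_file)
```

Because the application controls when the file is opened, it can also check the path exists first and fail with a clear message, rather than surfacing a FileNotFoundException from deep inside the config library.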
    <item>
      <title>Re: pass application.conf file into databricks jobs</title>
      <link>https://community.databricks.com/t5/data-engineering/pass-application-conf-file-into-databricks-jobs/m-p/5296#M1749</link>
      <description>&lt;P&gt;We had to put the conf file in the root folder of the mounted path, and that works.&lt;/P&gt;&lt;P&gt;Maybe the mounted storage account being Blob storage instead of ADLS Gen2 is causing the issues.&lt;/P&gt;</description>
      <pubDate>Fri, 28 Apr 2023 13:29:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pass-application-conf-file-into-databricks-jobs/m-p/5296#M1749</guid>
      <dc:creator>source2sea</dc:creator>
      <dc:date>2023-04-28T13:29:27Z</dc:date>
    </item>
  </channel>
</rss>

