<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Is there a way to validate the values of spark configs? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/is-there-a-way-to-validate-the-values-of-spark-configs/m-p/25233#M17525</link>
    <description>&lt;P&gt;We can set, for example:&lt;/P&gt;&lt;P&gt;spark.conf.set('aaa.test.junk.config', 99999), and then run spark.conf.get("aaa.test.junk.config"), which will return a value.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The problem occurs when a property name is mistyped so that it closely resembles a real one.&lt;/P&gt;&lt;P&gt;spark.conf.set('spark.sql.shuffle.partition', 999)&amp;nbsp;==&amp;gt; without the trailing 's'&lt;/P&gt;&lt;P&gt;Where the actual property is:&amp;nbsp;'spark.sql.shuffle.partition&lt;B&gt;s&lt;/B&gt;'&amp;nbsp;==&amp;gt; has a trailing 's'&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Running spark.conf.get('spark.sql.shuffle.partition')&amp;nbsp;will still return a value&amp;nbsp;==&amp;gt; without the trailing 's'&lt;/P&gt;&lt;P&gt;I thought I could run getAll() as a validation, but getAll() may not return properties that are explicitly defined in a Notebook session.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is there a way to check whether the config parameter I have used is actually valid? I don't see any error message either.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 10 Jun 2021 21:54:43 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2021-06-10T21:54:43Z</dc:date>
    <item>
      <title>Is there a way to validate the values of spark configs?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-there-a-way-to-validate-the-values-of-spark-configs/m-p/25233#M17525</link>
      <description>&lt;P&gt;We can set, for example:&lt;/P&gt;&lt;P&gt;spark.conf.set('aaa.test.junk.config', 99999), and then run spark.conf.get("aaa.test.junk.config"), which will return a value.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The problem occurs when a property name is mistyped so that it closely resembles a real one.&lt;/P&gt;&lt;P&gt;spark.conf.set('spark.sql.shuffle.partition', 999)&amp;nbsp;==&amp;gt; without the trailing 's'&lt;/P&gt;&lt;P&gt;Where the actual property is:&amp;nbsp;'spark.sql.shuffle.partition&lt;B&gt;s&lt;/B&gt;'&amp;nbsp;==&amp;gt; has a trailing 's'&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Running spark.conf.get('spark.sql.shuffle.partition')&amp;nbsp;will still return a value&amp;nbsp;==&amp;gt; without the trailing 's'&lt;/P&gt;&lt;P&gt;I thought I could run getAll() as a validation, but getAll() may not return properties that are explicitly defined in a Notebook session.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is there a way to check whether the config parameter I have used is actually valid? I don't see any error message either.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 10 Jun 2021 21:54:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-there-a-way-to-validate-the-values-of-spark-configs/m-p/25233#M17525</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-06-10T21:54:43Z</dc:date>
    </item>
    <item>
      <title>Re: Is there a way to validate the values of spark configs?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-there-a-way-to-validate-the-values-of-spark-configs/m-p/25234#M17526</link>
      <description>&lt;P&gt;You could check your property names against the list of &lt;A href="https://spark.apache.org/docs/latest/configuration.html" alt="https://spark.apache.org/docs/latest/configuration.html" target="_blank"&gt;valid configurations here&lt;/A&gt; and ensure that there are no typos.&lt;/P&gt;</description>
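      <content:encoded><![CDATA[<P>Checking names against the documented list can also be automated. The following is only a minimal sketch: KNOWN_KEYS is a hypothetical, hand-maintained subset of the documented property names (Spark itself does not expose this check through spark.conf.set), and check_key is an illustrative helper, not part of any Spark API. It rejects unknown keys and uses the standard-library difflib module to suggest the closest known name.</P>

```python
import difflib

# Hypothetical, hand-maintained subset of documented Spark property names.
# Extend it with the properties your own jobs actually set.
KNOWN_KEYS = {
    "spark.sql.shuffle.partitions",
    "spark.sql.autoBroadcastJoinThreshold",
    "spark.executor.memory",
}


def check_key(key, known=KNOWN_KEYS):
    """Return key if it is known; otherwise raise with a close-match hint."""
    if key in known:
        return key
    # Look for the most similar known name to point out a likely typo.
    hint = difflib.get_close_matches(key, sorted(known), n=1)
    suffix = f" Did you mean '{hint[0]}'?" if hint else ""
    raise ValueError(f"Unknown Spark config '{key}'.{suffix}")
```

<P>In a notebook this could be used as spark.conf.set(check_key('spark.sql.shuffle.partitions'), 999), so that a misspelled key fails before it is ever set.</P>]]></content:encoded>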
      <pubDate>Thu, 17 Jun 2021 18:19:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-there-a-way-to-validate-the-values-of-spark-configs/m-p/25234#M17526</guid>
      <dc:creator>sajith_appukutt</dc:creator>
      <dc:date>2021-06-17T18:19:20Z</dc:date>
    </item>
    <item>
      <title>Re: Is there a way to validate the values of spark configs?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-there-a-way-to-validate-the-values-of-spark-configs/m-p/25235#M17527</link>
      <description>&lt;P&gt;You would solve this just like we solve this problem for all loose string references. Namely, create a constant that represents the key you want to ensure doesn't get mistyped.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Naturally, if you type it wrong the first time, it will be wrong everywhere, but that is true for all software development. Beyond that, a simple assert will avoid regressions where someone might change your value.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The good news is that this has already been done for you, and you could simply include it in your code if you wanted to, as seen here:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Screenshot_95"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/2501i89ED1C36CB30A80C/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot_95" alt="Screenshot_95" /&gt;&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you are using Python, I would go ahead and create your own constants and assertions, given that integrating with the underlying Scala code just wouldn't be worth it (in my opinion).&lt;/P&gt;</description>
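      <content:encoded><![CDATA[<P>For Python, the constants-and-assertions approach might look roughly like the sketch below. The names SHUFFLE_PARTITIONS, set_and_verify and FakeConf are hypothetical and chosen only for illustration; FakeConf is a stand-in so the example runs without a Spark session, and in a notebook you would pass spark.conf instead. Values are compared as strings on the assumption that the config object returns what it stores as a string.</P>

```python
# Constant defined once, so the key string is never retyped elsewhere.
SHUFFLE_PARTITIONS = "spark.sql.shuffle.partitions"


def set_and_verify(conf, key, value):
    """Set key on conf, then assert the value reads back as expected."""
    conf.set(key, value)
    actual = conf.get(key)
    # Compare as strings because stored config values come back as strings here.
    assert str(actual) == str(value), f"{key}: expected {value}, got {actual}"
    return actual


class FakeConf:
    """Stand-in for spark.conf so the sketch runs without a Spark session."""

    def __init__(self):
        self._values = {}

    def set(self, key, value):
        self._values[key] = str(value)

    def get(self, key):
        return self._values[key]


conf = FakeConf()
set_and_verify(conf, SHUFFLE_PARTITIONS, 999)
```

<P>As the reply notes, this does not catch a constant that was mistyped when it was first written; it only keeps later code from drifting away from that one definition.</P>]]></content:encoded>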
      <pubDate>Fri, 30 Jul 2021 21:38:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-there-a-way-to-validate-the-values-of-spark-configs/m-p/25235#M17527</guid>
      <dc:creator>User16857281974</dc:creator>
      <dc:date>2021-07-30T21:38:33Z</dc:date>
    </item>
  </channel>
</rss>

