<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SparkContext lost when running %sh script.py in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67242#M33307</link>
    <description>&lt;P&gt;Thanks for your answer. That's also how I understand it. But is there a way to inject or connect to the pre-configured Spark session from within the Python script (.py file)?&lt;/P&gt;</description>
    <pubDate>Thu, 25 Apr 2024 08:55:15 GMT</pubDate>
    <dc:creator>madrhr</dc:creator>
    <dc:date>2024-04-25T08:55:15Z</dc:date>
    <item>
      <title>SparkContext lost when running %sh script.py</title>
      <link>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67180#M33299</link>
      <description>&lt;P&gt;I need to execute a .py file in Databricks from a notebook (with arguments, which for simplicity I exclude here). For this I am using:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;%sh script.py&lt;/LI-CODE&gt;&lt;P&gt;script.py:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from pyspark import SparkContext

def main():
    sc = SparkContext.getOrCreate()
    print(sc)

if __name__ == "__main__":
    main()&lt;/LI-CODE&gt;&lt;P&gt;However, I need a SparkContext in the .py file, and it is suggested to use SparkContext.getOrCreate(), but I get an exception saying that a master URL must be set:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;pyspark.errors.exceptions.base.PySparkRuntimeError: [MASTER_URL_NOT_SET] A master URL must be set in your configuration.&lt;/LI-CODE&gt;&lt;P&gt;Even if I set the master URL, I get another exception. What is really strange is that the same .py script works if I run it directly in Databricks using the little play button. It also works if I open a web terminal on the cluster and execute the script in that bash shell. With both of those approaches I get the SparkContext, but this is obviously not very useful. In the %sh shell and in the web terminal the user is root, the working directory is the same, and the Python environment is not the problem either.&lt;/P&gt;&lt;P&gt;The cluster I am using is a single-node NC24ads_A100, so only a driver node and no additional worker nodes, running DBR 14.2 ML and Spark 3.5.0.&lt;/P&gt;&lt;P&gt;I would be very happy to know what is so special about %sh, where my problem is, or what a workaround would be to execute .py files from a Databricks notebook with arguments while keeping/getting the SparkContext.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Apr 2024 11:41:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67180#M33299</guid>
      <dc:creator>madrhr</dc:creator>
      <dc:date>2024-04-24T11:41:23Z</dc:date>
    </item>
    <item>
      <title>Re: SparkContext lost when running %sh script.py</title>
      <link>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67234#M33305</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/104040"&gt;@madrhr&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;I think this occurs because the script launched with %sh runs in its own Python process and initiates its own session, while the Databricks notebook has a pre-configured Spark session. Note that we cannot use more than one Spark session per notebook; each session should be unique.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Apr 2024 06:37:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67234#M33305</guid>
      <dc:creator>Yeshwanth</dc:creator>
      <dc:date>2024-04-25T06:37:59Z</dc:date>
    </item>
    <item>
      <title>Re: SparkContext lost when running %sh script.py</title>
      <link>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67242#M33307</link>
      <description>&lt;P&gt;Thanks for your answer. That's also how I understand it. But is there a way to inject or connect to the pre-configured Spark session from within the Python script (.py file)?&lt;/P&gt;</description>
      <pubDate>Thu, 25 Apr 2024 08:55:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67242#M33307</guid>
      <dc:creator>madrhr</dc:creator>
      <dc:date>2024-04-25T08:55:15Z</dc:date>
    </item>
    <item>
      <title>Re: SparkContext lost when running %sh script.py</title>
      <link>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67571#M33369</link>
      <description>&lt;P&gt;I eventually got it working with a combination of:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import sys

from databricks.sdk.runtime import *  # exposes the notebook's pre-configured spark
spark.sparkContext.addPyFile("/path/to/your/file")
sys.path.append("path/to/your")&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 29 Apr 2024 10:15:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/67571#M33369</guid>
      <dc:creator>madrhr</dc:creator>
      <dc:date>2024-04-29T10:15:37Z</dc:date>
    </item>
    <item>
      <title>Re: SparkContext lost when running %sh script.py</title>
      <link>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/116929#M45392</link>
      <description>&lt;P&gt;Hey, how do you add arguments to this? My script uses argparse and I want to pass arguments like --arg_name "value". Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 29 Apr 2025 07:39:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/116929#M45392</guid>
      <dc:creator>NickGeo</dc:creator>
      <dc:date>2025-04-29T07:39:07Z</dc:date>
    </item>
    <item>
      <title>Re: SparkContext lost when running %sh script.py</title>
      <link>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/148582#M52924</link>
      <description>&lt;P&gt;Hey &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/104040"&gt;@madrhr&lt;/a&gt;, I am checking on this and it is not working for me. Can you help with screenshots of the implementation?&lt;/P&gt;</description>
      <pubDate>Tue, 17 Feb 2026 08:55:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sparkcontext-lost-when-running-sh-script-py/m-p/148582#M52924</guid>
      <dc:creator>ajay_wavicle</dc:creator>
      <dc:date>2026-02-17T08:55:06Z</dc:date>
    </item>
  </channel>
</rss>

