<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How can I run a Pyspark python script in a scala environment in Warehousing &amp; Analytics</title>
    <link>https://community.databricks.com/t5/warehousing-analytics/how-can-i-run-a-pyspark-python-script-in-a-scala-environment/m-p/30313#M718</link>
    <description>&lt;P&gt;The error is in the attachment:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/2140i9BD021B5B599A797/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 30 Jan 2022 02:45:44 GMT</pubDate>
    <dc:creator>177331</dc:creator>
    <dc:date>2022-01-30T02:45:44Z</dc:date>
    <item>
      <title>How can I run a Pyspark python script in a scala environment</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/how-can-i-run-a-pyspark-python-script-in-a-scala-environment/m-p/30312#M717</link>
      <description>&lt;P&gt;I need to use both Python Spark code and Scala Spark code in my project. A lot of the project configuration lives on the Scala side, so I want to generate the data from Scala and pass the data path to my Python script. The script can then use the Python ecosystem to train models and produce a dataset as its result, which Scala reads and passes into our downstream system.&lt;/P&gt;&lt;P&gt;However, when I tested the code below, I ran into some issues. Am I doing anything wrong? Is there a better way to achieve my goal?&lt;/P&gt;&lt;P&gt;Cmd 2, which runs a print-hello script, works fine.&lt;/P&gt;&lt;P&gt;Cmd 4, which runs the PySpark Python script, produces this error:&lt;/P&gt;&lt;PRE&gt;Error: Could not find or load main class org.apache.spark.launcher.Main
/databricks/spark/bin/spark-class: line 101: CMD: bad array subscript
Traceback (most recent call last):
  File "/tmp/cli.py", line 23, in &amp;lt;module&amp;gt;
    cli.main(sys.argv[1:], standalone_mode=False)
  File "/databricks/python3/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/databricks/python3/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/databricks/python3/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/tmp/cli.py", line 19, in cli
    spark = SparkSession.builder.getOrCreate()
  File "/databricks/spark/python/pyspark/sql/session.py", line 229, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/databricks/spark/python/pyspark/context.py", line 392, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/databricks/spark/python/pyspark/context.py", line 145, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/databricks/spark/python/pyspark/context.py", line 339, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/databricks/spark/python/pyspark/java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number&lt;/PRE&gt;&lt;P&gt;The Scala cell itself reports:&lt;/P&gt;&lt;PRE&gt;stdout: java.io.PrintStream@2202fa90
stderr: java.io.PrintStream@4133a68d
import sys.process._
callPythonCli: ()Unit&lt;/PRE&gt;</description>
      <pubDate>Sun, 30 Jan 2022 02:43:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/how-can-i-run-a-pyspark-python-script-in-a-scala-environment/m-p/30312#M717</guid>
      <dc:creator>177331</dc:creator>
      <dc:date>2022-01-30T02:43:27Z</dc:date>
    </item>
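A minimal sketch of the path-handoff pattern the question describes, in plain Python. All names here (run_handoff, the inline cli.py stand-in) are hypothetical illustrations, not Databricks APIs. The point of the sketch: the script launched from the Scala cell exchanges data through file paths and does not call SparkSession.builder.getOrCreate() itself. The traceback above dies inside launch_gateway, which suggests the child process is trying to boot its own Spark JVM gateway from inside the driver; a child that only reads and writes files avoids that entirely.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for /tmp/cli.py. It takes an input path and an output path,
# does its "model training" work in plain Python, and never creates a
# SparkSession -- that call is what fails when the script is launched
# from inside a Scala notebook cell.
CHILD_SCRIPT = """\
import json, sys
in_path, out_path = sys.argv[1], sys.argv[2]
rows = json.load(open(in_path))
result = [{"id": r["id"], "score": r["value"] * 2} for r in rows]
json.dump(result, open(out_path, "w"))
"""

def run_handoff(rows):
    """Write rows to a temp path, run the child script as an OS process
    (the role sys.process plays on the Scala side), read the result back."""
    with tempfile.TemporaryDirectory() as d:
        script = Path(d, "cli.py")
        in_path = Path(d, "in.json")
        out_path = Path(d, "out.json")
        script.write_text(CHILD_SCRIPT)
        in_path.write_text(json.dumps(rows))
        subprocess.run(
            [sys.executable, str(script), str(in_path), str(out_path)],
            check=True,
        )
        return json.loads(out_path.read_text())

print(run_handoff([{"id": 1, "value": 10}, {"id": 2, "value": 20}]))
# prints [{'id': 1, 'score': 20}, {'id': 2, 'score': 40}]
```

In the actual notebook the Scala side would write the input dataset with Spark, shell out via sys.process exactly as in the question, and read the result path back with Spark; only the child's responsibilities differ, since all Spark work stays on the Scala side.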
    <item>
      <title>Re: How can I run a Pyspark python script in a scala environment</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/how-can-i-run-a-pyspark-python-script-in-a-scala-environment/m-p/30313#M718</link>
      <description>&lt;P&gt;The error is in the attachment:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/2140i9BD021B5B599A797/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 30 Jan 2022 02:45:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/how-can-i-run-a-pyspark-python-script-in-a-scala-environment/m-p/30313#M718</guid>
      <dc:creator>177331</dc:creator>
      <dc:date>2022-01-30T02:45:44Z</dc:date>
    </item>
  </channel>
</rss>

