<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: pytest error in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70599#M7332</link>
    <description>&lt;P&gt;Hi,&lt;BR /&gt;&lt;BR /&gt;The error message and stack trace don't seem to suggest that this is a pytest failure. If that's true, can you please try to replicate what's being invoked from `src\tests\test_calculate_psi_for_each_column.py` in a Python REPL?&lt;BR /&gt;&lt;BR /&gt;And separately, have you confirmed that the pyspark installation was successful? Depending on how you installed pyspark, are you able to start pyspark by itself (independent of `findspark`), such as `$SPARK_HOME/bin/pyspark`?&lt;BR /&gt;&lt;BR /&gt;Thanks.&lt;/P&gt;</description>
    <pubDate>Fri, 24 May 2024 17:46:03 GMT</pubDate>
    <dc:creator>brockb</dc:creator>
    <dc:date>2024-05-24T17:46:03Z</dc:date>
    <item>
      <title>pytest error</title>
      <link>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70595#M7331</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have a quick question. If my source code calls pyspark collect() or any other RDD-related method, then pytest on my local PC reports the following error. My local machine doesn't have any specific setting for pyspark; I used the findspark package. If you know the solution, it would be greatly appreciated. Thanks.&lt;/P&gt;&lt;P&gt;src\tests\test_calculate_psi_for_each_column.py:14: in &amp;lt;module&amp;gt;&lt;BR /&gt;spark = spu.create_spark_obj()&lt;BR /&gt;src\prototype\supplementary_utils.py:959: in create_spark_obj&lt;BR /&gt;spark = SparkSession.builder.appName("ABC").getOrCreate()&lt;BR /&gt;venv\lib\site-packages\pyspark\sql\session.py:269: in getOrCreate&lt;BR /&gt;sc = SparkContext.getOrCreate(sparkConf)&lt;BR /&gt;venv\lib\site-packages\pyspark\context.py:483: in getOrCreate&lt;BR /&gt;SparkContext(conf=conf or SparkConf())&lt;BR /&gt;venv\lib\site-packages\pyspark\context.py:197: in __init__&lt;BR /&gt;self._do_init(&lt;BR /&gt;venv\lib\site-packages\pyspark\context.py:282: in _do_init&lt;BR /&gt;self._jsc = jsc or self._initialize_context(self._conf._jconf)&lt;BR /&gt;venv\lib\site-packages\pyspark\context.py:402: in _initialize_context&lt;BR /&gt;return self._jvm.JavaSparkContext(jconf)&lt;BR /&gt;venv\lib\site-packages\py4j\java_gateway.py:1585: in __call__&lt;BR /&gt;return_value = get_return_value(&lt;BR /&gt;venv\lib\site-packages\py4j\protocol.py:326: in get_return_value&lt;BR /&gt;raise Py4JJavaError(&lt;BR /&gt;E py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.&lt;BR /&gt;E : java.lang.ExceptionInInitializerError&lt;BR /&gt;E at org.apache.spark.unsafe.array.ByteArrayMethods.&amp;lt;clinit&amp;gt;(ByteArrayMethods.java:56)&lt;BR /&gt;E at org.apache.spark.memory.MemoryManager.defaultPageSizeBytes$lzycompute(MemoryManager.scala:264)&lt;BR /&gt;E at 
org.apache.spark.memory.MemoryManager.defaultPageSizeBytes(MemoryManager.scala:254)&lt;BR /&gt;E at org.apache.spark.memory.MemoryManager.$anonfun$pageSizeBytes$1(MemoryManager.scala:273)&lt;BR /&gt;E at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)&lt;BR /&gt;E at scala.Option.getOrElse(Option.scala:189)&lt;BR /&gt;E at org.apache.spark.memory.MemoryManager.&amp;lt;init&amp;gt;(MemoryManager.scala:273)&lt;BR /&gt;E at org.apache.spark.memory.UnifiedMemoryManager.&amp;lt;init&amp;gt;(UnifiedMemoryManager.scala:58)&lt;BR /&gt;E at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:207)&lt;BR /&gt;E at org.apache.spark.SparkEnv$.create(SparkEnv.scala:320)&lt;BR /&gt;E at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:194)&lt;BR /&gt;E at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:279)&lt;BR /&gt;E at org.apache.spark.SparkContext.&amp;lt;init&amp;gt;(SparkContext.scala:464)&lt;BR /&gt;E at org.apache.spark.api.java.JavaSparkContext.&amp;lt;init&amp;gt;(JavaSparkContext.scala:58)&lt;BR /&gt;E at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)&lt;BR /&gt;E at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)&lt;BR /&gt;E at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)&lt;BR /&gt;E at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)&lt;BR /&gt;E at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)&lt;BR /&gt;E at py4j.Gateway.invoke(Gateway.java:238)&lt;BR /&gt;E at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)&lt;BR /&gt;E at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)&lt;BR /&gt;E at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)&lt;BR /&gt;E at py4j.ClientServerConnection.run(ClientServerConnection.java:106)&lt;BR /&gt;E at 
java.base/java.lang.Thread.run(Thread.java:1570)&lt;BR /&gt;E Caused by: java.lang.IllegalStateException: java.lang.NoSuchMethodException: java.nio.DirectByteBuffer.&amp;lt;init&amp;gt;(long,int)&lt;BR /&gt;E at org.apache.spark.unsafe.Platform.&amp;lt;clinit&amp;gt;(Platform.java:113)&lt;BR /&gt;E ... 25 more&lt;BR /&gt;E Caused by: java.lang.NoSuchMethodException: java.nio.DirectByteBuffer.&amp;lt;init&amp;gt;(long,int)&lt;BR /&gt;E at java.base/java.lang.Class.getConstructor0(Class.java:3784)&lt;BR /&gt;E at java.base/java.lang.Class.getDeclaredConstructor(Class.java:2955)&lt;BR /&gt;E at org.apache.spark.unsafe.Platform.&amp;lt;clinit&amp;gt;(Platform.java:71)&lt;BR /&gt;E ... 25 more&lt;/P&gt;</description>
      <pubDate>Fri, 24 May 2024 15:27:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70595#M7331</guid>
      <dc:creator>hong</dc:creator>
      <dc:date>2024-05-24T15:27:54Z</dc:date>
    </item>
    <item>
      <title>Re: pytest error</title>
      <link>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70599#M7332</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;&lt;BR /&gt;The error message and stack trace don't seem to suggest that this is a pytest failure. If that's true, can you please try to replicate what's being invoked from `src\tests\test_calculate_psi_for_each_column.py` in a Python REPL?&lt;BR /&gt;&lt;BR /&gt;And separately, have you confirmed that the pyspark installation was successful? Depending on how you installed pyspark, are you able to start pyspark by itself (independent of `findspark`), such as `$SPARK_HOME/bin/pyspark`?&lt;BR /&gt;&lt;BR /&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 24 May 2024 17:46:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70599#M7332</guid>
      <dc:creator>brockb</dc:creator>
      <dc:date>2024-05-24T17:46:03Z</dc:date>
    </item>
    <item>
      <title>Re: pytest error</title>
      <link>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70833#M7333</link>
      <description>&lt;P&gt;When I ran pyspark directly, it failed.&lt;/P&gt;&lt;P&gt;C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\Lib\site-packages\pyspark\bin\pyspark&lt;BR /&gt;Python 3.9.13 (main, Aug 25 2022, 23:51:50) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32&lt;BR /&gt;Type "help", "copyright", "credits" or "license" for more information.&lt;BR /&gt;24/05/27 20:40:38 WARN Shell: Did not find winutils.exe: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see &lt;A href="https://wiki.apache.org/hadoop/WindowsProblems" target="_blank"&gt;https://wiki.apache.org/hadoop/WindowsProblems&lt;/A&gt;&lt;BR /&gt;Setting default log level to "WARN".&lt;BR /&gt;To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).&lt;BR /&gt;24/05/27 20:40:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable&lt;BR /&gt;24/05/27 20:40:40 WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor). This may indicate an error, since only one SparkContext should be running in this JVM (see SPARK-2243). 
The other SparkContext was created at:&lt;BR /&gt;org.apache.spark.api.java.JavaSparkContext.&amp;lt;init&amp;gt;(JavaSparkContext.scala:58)&lt;BR /&gt;java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)&lt;BR /&gt;java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)&lt;BR /&gt;java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)&lt;BR /&gt;py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)&lt;BR /&gt;py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)&lt;BR /&gt;py4j.Gateway.invoke(Gateway.java:238)&lt;BR /&gt;py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)&lt;BR /&gt;py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)&lt;BR /&gt;py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)&lt;BR /&gt;py4j.ClientServerConnection.run(ClientServerConnection.java:106)&lt;BR /&gt;java.base/java.lang.Thread.run(Thread.java:1570)&lt;BR /&gt;C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\Lib\site-packages\pyspark\bin\..\python\pyspark\shell.py:44: UserWarning: Failed to initialize Spark session.&lt;BR /&gt;warnings.warn("Failed to initialize Spark session.")&lt;BR /&gt;Traceback (most recent call last):&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\Lib\site-packages\pyspark\bin\..\python\pyspark\shell.py", line 39, in &amp;lt;module&amp;gt;&lt;BR /&gt;spark = SparkSession._create_shell_session()&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\lib\site-packages\pyspark\sql\session.py", line 677, in _create_shell_session&lt;BR /&gt;return SparkSession._getActiveSessionOrCreate()&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\lib\site-packages\pyspark\sql\session.py", line 693, in _getActiveSessionOrCreate&lt;BR /&gt;spark = 
builder.getOrCreate()&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\lib\site-packages\pyspark\sql\session.py", line 269, in getOrCreate&lt;BR /&gt;sc = SparkContext.getOrCreate(sparkConf)&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\lib\site-packages\pyspark\context.py", line 483, in getOrCreate SparkContext(conf=conf or SparkConf())&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\lib\site-packages\pyspark\context.py", line 197, in __init__&lt;BR /&gt;self._do_init(&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\lib\site-packages\pyspark\context.py", line 282, in _do_init&lt;BR /&gt;self._jsc = jsc or self._initialize_context(self._conf._jconf)&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\lib\site-packages\pyspark\context.py", line 402, in _initialize_context&lt;BR /&gt;return self._jvm.JavaSparkContext(jconf)&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\Lib\site-packages\pyspark\python\lib\py4j-0.10.9.5-src.zip\py4j\java_gateway.py", line 1585, in __call__&lt;BR /&gt;return_value = get_return_value(&lt;BR /&gt;File "C:\Users\HXI1\hx_scripts\core-globaldata-dpm-mlops-pipeline-1\venv\Lib\site-packages\pyspark\python\lib\py4j-0.10.9.5-src.zip\py4j\protocol.py", line 326, in get_return_value&lt;BR /&gt;raise Py4JJavaError(&lt;BR /&gt;py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.&lt;BR /&gt;: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.unsafe.array.ByteArrayMethods&lt;BR /&gt;at org.apache.spark.memory.MemoryManager.defaultPageSizeBytes$lzycompute(MemoryManager.scala:264)&lt;BR /&gt;at org.apache.spark.memory.MemoryManager.defaultPageSizeBytes(MemoryManager.scala:254)&lt;BR /&gt;at 
org.apache.spark.memory.MemoryManager.$anonfun$pageSizeBytes$1(MemoryManager.scala:273)&lt;BR /&gt;at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)&lt;BR /&gt;at scala.Option.getOrElse(Option.scala:189)&lt;BR /&gt;at org.apache.spark.memory.MemoryManager.&amp;lt;init&amp;gt;(MemoryManager.scala:273)&lt;BR /&gt;at org.apache.spark.memory.UnifiedMemoryManager.&amp;lt;init&amp;gt;(UnifiedMemoryManager.scala:58)&lt;BR /&gt;at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:207)&lt;BR /&gt;at org.apache.spark.SparkEnv$.create(SparkEnv.scala:320)&lt;BR /&gt;at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:194)&lt;BR /&gt;at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:279)&lt;BR /&gt;at org.apache.spark.SparkContext.&amp;lt;init&amp;gt;(SparkContext.scala:464)&lt;BR /&gt;at org.apache.spark.api.java.JavaSparkContext.&amp;lt;init&amp;gt;(JavaSparkContext.scala:58)&lt;BR /&gt;at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)&lt;BR /&gt;at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)&lt;BR /&gt;at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)&lt;BR /&gt;at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)&lt;BR /&gt;at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)&lt;BR /&gt;at py4j.Gateway.invoke(Gateway.java:238)&lt;BR /&gt;at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)&lt;BR /&gt;at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)&lt;BR /&gt;at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)&lt;BR /&gt;at py4j.ClientServerConnection.run(ClientServerConnection.java:106)&lt;BR /&gt;at java.base/java.lang.Thread.run(Thread.java:1570)&lt;BR /&gt;Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.ExceptionInInitializerError 
[in thread "Thread-2"]&lt;BR /&gt;at org.apache.spark.unsafe.array.ByteArrayMethods.&amp;lt;clinit&amp;gt;(ByteArrayMethods.java:56)&lt;BR /&gt;... 24 more&lt;/P&gt;&lt;P&gt;ERROR: The process with PID 33248 (child process of PID 43512) could not be terminated.&lt;BR /&gt;Reason: Access is denied.&lt;BR /&gt;SUCCESS: The process with PID 43512 (child process of PID 14852) has been terminated.&lt;BR /&gt;SUCCESS: The process with PID 14852 (child process of PID 42180) has been terminated.&lt;/P&gt;</description>
      <pubDate>Tue, 28 May 2024 01:43:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70833#M7333</guid>
      <dc:creator>hong</dc:creator>
      <dc:date>2024-05-28T01:43:53Z</dc:date>
    </item>
    <item>
      <title>Re: pytest error</title>
      <link>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70842#M7334</link>
      <description>&lt;P&gt;I don't personally have any experience running Spark on Windows. Can you please review the wiki article referenced in the WARN message to see if it helps you complete the installation successfully? Alternatively, could you consider running the tests on Databricks if you continue having issues with the Windows setup?&lt;BR /&gt;&lt;BR /&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Tue, 28 May 2024 04:07:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/70842#M7334</guid>
      <dc:creator>brockb</dc:creator>
      <dc:date>2024-05-28T04:07:11Z</dc:date>
    </item>
    <item>
      <title>Re: pytest error</title>
      <link>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/71109#M7335</link>
      <description>&lt;P&gt;Thank you very much, brockb. I will probably try it in Databricks. Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 30 May 2024 17:00:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/pytest-error/m-p/71109#M7335</guid>
      <dc:creator>hong</dc:creator>
      <dc:date>2024-05-30T17:00:02Z</dc:date>
    </item>
  </channel>
</rss>

