<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Cannot load spark-avro jars with Databricks version 10.4 in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/cannot-load-spark-avro-jars-with-databricksversion-10-4/m-p/34429#M25182</link>
    <description>&lt;P&gt;I have been facing an issue since the `databricks-connect` runtime on our cluster was updated to 10.4. Since then, I can no longer load the jars for spark-avro. When I run the following code:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import SparkSession

spark = SparkSession.builder.config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.3.0").getOrCreate()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I get the following error:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;The jars for the packages stored in: C:\Users\lazlo\.ivy2\jars
org.apache.spark#spark-avro_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-dc011dfd-9d25-4d6f-9d0e-354626e7c1f8;1.0
   confs: [default]
   found org.apache.spark#spark-avro_2.12;3.3.0 in central
   found org.tukaani#xz;1.8 in central
   found org.spark-project.spark#unused;1.0.0 in central
:: resolution report :: resolve 156ms :: artifacts dl 4ms
   :: modules in use:
   org.apache.spark#spark-avro_2.12;3.3.0 from central in [default]
   org.spark-project.spark#unused;1.0.0 from central in [default]
   org.tukaani#xz;1.8 from central in [default]
   ---------------------------------------------------------------------
   |                  |            modules            ||   artifacts   |
   |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
   ---------------------------------------------------------------------
   |      default     |   3   |   0   |   0   |   0   ||   3   |   0   |
   ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-dc011dfd-9d25-4d6f-9d0e-354626e7c1f8
   confs: [default]
   0 artifacts copied, 3 already retrieved (0kB/5ms)
22/08/16 13:15:57 WARN Shell: Did not find winutils.exe: {}&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;...&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Traceback (most recent call last):
  File "C:/Aifora/repositories/test_poetry/tmp_jars.py", line 4, in &amp;lt;module&amp;gt;
    spark = SparkSession.builder.config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.3.0").getOrCreate()
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\sql\session.py", line 229, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\context.py", line 400, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\context.py", line 147, in __init__
    self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\context.py", line 210, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\context.py", line 337, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\py4j\java_gateway.py", line 1568, in __call__
    return_value = get_return_value(
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\py4j\protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;In case it matters: I am on a Windows machine (Windows 11) and manage packages with Poetry. Here is my pyproject.toml:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;[tool.poetry]
name = "test_poetry"
version = "1.37.5"
description = ""
authors = [
    "lazloo xp ",
]

[[tool.poetry.source]]
name = "***_nexus"
url = "https://nexus.infrastructure.xxxx.net/repository/pypi-all/simple/"
default = true

[tool.poetry.dependencies]
python = "==3.8.*"
databricks-connect = "^10.4"&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Wed, 17 Aug 2022 05:55:45 GMT</pubDate>
    <dc:creator>Lazloo</dc:creator>
    <dc:date>2022-08-17T05:55:45Z</dc:date>
    <item>
      <title>Cannot load spark-avro jars with Databricks version 10.4</title>
      <link>https://community.databricks.com/t5/data-engineering/cannot-load-spark-avro-jars-with-databricksversion-10-4/m-p/34429#M25182</link>
      <description>&lt;P&gt;I have been facing an issue since the `databricks-connect` runtime on our cluster was updated to 10.4. Since then, I can no longer load the jars for spark-avro. When I run the following code:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import SparkSession

spark = SparkSession.builder.config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.3.0").getOrCreate()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I get the following error:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;The jars for the packages stored in: C:\Users\lazlo\.ivy2\jars
org.apache.spark#spark-avro_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-dc011dfd-9d25-4d6f-9d0e-354626e7c1f8;1.0
   confs: [default]
   found org.apache.spark#spark-avro_2.12;3.3.0 in central
   found org.tukaani#xz;1.8 in central
   found org.spark-project.spark#unused;1.0.0 in central
:: resolution report :: resolve 156ms :: artifacts dl 4ms
   :: modules in use:
   org.apache.spark#spark-avro_2.12;3.3.0 from central in [default]
   org.spark-project.spark#unused;1.0.0 from central in [default]
   org.tukaani#xz;1.8 from central in [default]
   ---------------------------------------------------------------------
   |                  |            modules            ||   artifacts   |
   |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
   ---------------------------------------------------------------------
   |      default     |   3   |   0   |   0   |   0   ||   3   |   0   |
   ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-dc011dfd-9d25-4d6f-9d0e-354626e7c1f8
   confs: [default]
   0 artifacts copied, 3 already retrieved (0kB/5ms)
22/08/16 13:15:57 WARN Shell: Did not find winutils.exe: {}&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;...&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Traceback (most recent call last):
  File "C:/Aifora/repositories/test_poetry/tmp_jars.py", line 4, in &amp;lt;module&amp;gt;
    spark = SparkSession.builder.config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.3.0").getOrCreate()
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\sql\session.py", line 229, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\context.py", line 400, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\context.py", line 147, in __init__
    self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\context.py", line 210, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\pyspark\context.py", line 337, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\py4j\java_gateway.py", line 1568, in __call__
    return_value = get_return_value(
  File "C:\Users\lazlo\AppData\Local\pypoetry\Cache\virtualenvs\test-poetry-vvodToDL-py3.8\lib\site-packages\py4j\protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;In case it matters: I am on a Windows machine (Windows 11) and manage packages with Poetry. Here is my pyproject.toml:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;[tool.poetry]
name = "test_poetry"
version = "1.37.5"
description = ""
authors = [
    "lazloo xp ",
]

[[tool.poetry.source]]
name = "***_nexus"
url = "https://nexus.infrastructure.xxxx.net/repository/pypi-all/simple/"
default = true

[tool.poetry.dependencies]
python = "==3.8.*"
databricks-connect = "^10.4"&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 17 Aug 2022 05:55:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cannot-load-spark-avro-jars-with-databricksversion-10-4/m-p/34429#M25182</guid>
      <dc:creator>Lazloo</dc:creator>
      <dc:date>2022-08-17T05:55:45Z</dc:date>
    </item>
  </channel>
</rss>

