03-02-2023 12:21 AM
I am getting the error below sometimes when I run my Databricks notebook from ADF.
If there is only one executor node it works fine; if it increases to 2 or more, it sometimes fails on the same data.
Cluster Detail : Standard_F4s_v2 · Workers: Standard_F4s_v2 · 1-8 workers · 11.2 (includes Apache Spark 3.3.0, Scala 2.12)
File "/databricks/python/lib/python3.9/site-packages/Levenshtein/__init__.py", line 343 in opcodes
File "/databricks/python/lib/python3.9/site-packages/fuzzywuzzy/StringMatcher.py", line 45 in get_opcodes
File "/databricks/python/lib/python3.9/site-packages/fuzzywuzzy/StringMatcher.py", line 58 in get_matching_blocks
File "/databricks/python/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py", line 47 in partial_ratio
File "/databricks/python/lib/python3.9/site-packages/fuzzywuzzy/utils.py", line 47 in decorator
File "/databricks/python/lib/python3.9/site-packages/fuzzywuzzy/utils.py", line 29 in decorator
File "/databricks/python/lib/python3.9/site-packages/fuzzywuzzy/utils.py", line 38 in decorator
File "/databricks/python/lib/python3.9/site-packages/my_package/my_function.py", line 30 in scrap_url
File "/databricks/python/lib/python3.9/site-packages/my_package/my_function.py", line 124 in my_function
File "<command-1514877556254536>", line 20 in my_function
File "<command-1514877556254534>", line 7 in my_function_01
File "/databricks/spark/python/pyspark/util.py", line 84 in wrapper
File "/databricks/spark/python/pyspark/worker.py", line 130 in <lambda>
File "/databricks/spark/python/pyspark/worker.py", line 591 in mapper
File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 384 in init_stream_yield_batches
File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 91 in dump_stream
File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 391 in dump_stream
File "/databricks/spark/python/pyspark/worker.py", line 885 in process
File "/databricks/spark/python/pyspark/worker.py", line 893 in main
File "/databricks/spark/python/pyspark/daemon.py", line 79 in worker
File "/databricks/spark/python/pyspark/daemon.py", line 204 in manager
File "/databricks/spark/python/pyspark/daemon.py", line 229 in <module>
File "/usr/lib/python3.9/runpy.py", line 87 in _run_code
File "/usr/lib/python3.9/runpy.py", line 197 in _run_module_as_main
Can anyone please help me with this? Is it a time-lag issue with a task?
Sometimes it works and sometimes it doesn't.
If I don't add the "Levenshtein" package, the 3 tasks take 2 hours to complete.
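The traceback shows the worker dying inside Levenshtein's C extension (`opcodes`) when called from fuzzywuzzy's `partial_ratio`; the slowdown without the Levenshtein package happens because fuzzywuzzy then falls back to the pure-Python `difflib.SequenceMatcher`. One plausible (unconfirmed) trigger for the crash is non-string input such as None/NaN reaching the C layer. Below is a minimal stdlib sketch of a defensive, pure-Python `partial_ratio` equivalent that coerces inputs first; the function name `safe_partial_ratio` is hypothetical, not part of fuzzywuzzy:

```python
from difflib import SequenceMatcher

def safe_partial_ratio(s1, s2):
    """Pure-Python approximation of fuzzywuzzy's partial_ratio.

    Coerces None / non-string inputs to strings first — bad inputs
    reaching the C Levenshtein layer is one possible (assumed, not
    confirmed by the logs above) cause of the worker crash.
    """
    s1 = "" if s1 is None else str(s1)
    s2 = "" if s2 is None else str(s2)
    if not s1 or not s2:
        return 0
    shorter, longer = (s1, s2) if len(s1) <= len(s2) else (s2, s1)
    matcher = SequenceMatcher(None, shorter, longer)
    best = 0.0
    for block in matcher.get_matching_blocks():
        # Slide a window of the longer string over the shorter one,
        # mirroring fuzzywuzzy's windowed comparison.
        start = max(block.b - block.a, 0)
        window = longer[start:start + len(shorter)]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
        if best > 0.995:
            break
    return int(round(best * 100))
```

Calling this from the pandas UDF instead of `fuzz.partial_ratio` would be slower but avoids the C extension entirely, which can help isolate whether the crash comes from that library or from memory pressure on the workers.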
03-02-2023 01:24 AM
@Kaniz Fatma Please help me on this
03-06-2023 07:00 PM
My solution was updating Python. Upgrading Python to 3.10.9 solved my problem when using SparkTrials() in hyperopt's fmin().
Error: org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
03-16-2023 10:34 PM
Hi @Ancil P A
Hope all is well! Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!
03-23-2023 01:31 AM
The issue is not resolved. I trimmed the data length so each run processes 1 crore (10 million) rows, and then it works.
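The workaround above caps each run at a fixed batch of rows. As a minimal stdlib sketch of that batching idea (in Spark itself this would more likely be done by filtering or limiting the DataFrame per run; the helper name `chunked` is hypothetical):

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

# Example: process records in batches of 3. The poster's batches
# were 1 crore (10_000_000) rows per run.
batches = list(chunked(range(10), 3))
```

Processing in bounded batches keeps per-task memory roughly constant, which matters if the underlying crash is the Python worker running out of memory on larger inputs.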
03-30-2023 01:58 AM
Hi @Ancil P A
Help us build a vibrant and resourceful community by recognizing and highlighting insightful contributions. Mark the best answers and show your appreciation!
04-04-2023 02:21 PM
Hi @Ancil P A
Can you paste the complete stack trace from the failed task (from failed stage 10.0) and the code snippet that you are trying to run in the notebook? Also, could you raise a Databricks support ticket for the same?
04-10-2023 12:12 AM
I have pasted all the logs available in Databricks.
05-09-2023 06:22 AM
Hi @Swetha Nandajan
Please find the full error log. I have a job running every hour; my notebook worked for 20 runs, and after that I am getting the error below.
I am creating a new job cluster for every run from ADF.
Cluster details: Driver: Standard_F32s_v2 · Workers: Standard_F32s_v2 · 1 worker · 11.3 LTS (includes Apache Spark 3.3.0, Scala 2.12).
I tried the failing data in my QA environment and it works, but in the testing environment I am getting the error below after 20 runs.
Error
Attached as a file
Please help me.