Databricks Community

vigneshkannan12 · ‎06-07-2024

I installed the stanza library and am trying to create a UDF that generates NER tags for the chunk_text column in my DataFrame.

Cluster config: DBR 14.3 LTS, Spark 3.5.0, Scala 2.12

Code:

from pyspark.sql.functions import udf
from pyspark.sql.types import ArrayType, StringType, StructField, StructType

def extract_entities(text):
    # Import inside the function so the import happens on the worker
    import stanza
    nlp = stanza.Pipeline('en', processors='tokenize,ner', use_gpu=False)
    doc = nlp(text)
    # One (text, type) pair per named entity across all sentences
    entities = [(entity.text, entity.type) for sentence in doc.sentences for entity in sentence.ents]
    return entities

# Register the UDF
entity_udf = udf(extract_entities, ArrayType(StructType([
    StructField("text", StringType(), True),
    StructField("type", StringType(), True)
])))

df = spark.sql("select * from datafabric_catalog.gen_ai.wiki limit 1")
df_with_entities = df.withColumn("entities", entity_udf(df["chunk_text"]))

It throws the following error:

from typing_extensions import Literal, Match, TypedDict
ImportError: cannot import name 'Match' from 'typing_extensions' (/databricks/python3/lib/python3.10/site-packages/typing_extensions.py)
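The path in the error message points at the typing_extensions copy under /databricks/python3, which suggests the import is resolving to a copy that does not define the requested name. A minimal sketch for checking this, assuming nothing beyond the standard library: the helper name missing_names is hypothetical, and the demonstration uses the stdlib typing module with a made-up name so it runs anywhere.

```python
import importlib


def missing_names(module_name, names):
    """Import a module and report where it was loaded from and which names it lacks."""
    module = importlib.import_module(module_name)
    return {
        "file": getattr(module, "__file__", None),
        "missing": [n for n in names if not hasattr(module, n)],
    }


# Stdlib demonstration; "NoSuchName" is a made-up name that typing does not define.
report = missing_names("typing", ["List", "NoSuchName"])
print(report["file"], report["missing"])

# On the cluster, the call of interest would be something like:
# missing_names("typing_extensions", ["Match", "override"])
```

If the reported file is the preinstalled cluster copy and the name shows up under "missing", the library being imported probably expects a newer typing_extensions than the one the interpreter picked up.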

Kaizen · ‎07-01-2024

Did you end up solving this?

I am getting a similar issue on Databricks with DBR 13.4 and 14.4 LTS:
ImportError: cannot import name 'override' from 'typing_extensions' (/databricks/python3/lib/python3.10/site-packages/typing_extensions.py)
== Stacktrace ==
File "/home/spark-f18133fd-0fbd-4dd5-8ab0-a6/.ipykernel/1940/command-1617845290167209-140570505", line 39, in <lambda>
File "/home/spark-f18133fd-0fbd-4dd5-8ab0-a6/.ipykernel/1940/command-1617845290167209-140570505", line 22, in get_embedding
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f18133fd-0fbd-4dd5-8ab0-a6fe7e41de88/lib/python3.10/site-packages/langchain_openai/__init__.py", line 1, in <module>
from langchain_openai.chat_models import AzureChatOpenAI, ChatOpenAI
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f18133fd-0fbd-4dd5-8ab0-a6fe7e41de88/lib/python3.10/site-packages/langchain_openai/chat_models/__init__.py", line 1, in <module>
from langchain_openai.chat_models.azure import AzureChatOpenAI
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f18133fd-0fbd-4dd5-8ab0-a6fe7e41de88/lib/python3.10/site-packages/langchain_openai/chat_models/azure.py", line 22, in <module>
import openai
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f18133fd-0fbd-4dd5-8ab0-a6fe7e41de88/lib/python3.10/site-packages/openai/__init__.py", line 6, in <module>
from typing_extensions import override
at com.databricks.sql.execution.safespark.SafesparkErrorMessages$.createSparkRuntimeException(SafesparkErrorMessages.scala:134)
at com.databricks.sql.execution.safespark.SafesparkErrorMessages$.convertT

SaadhikaB · ‎08-07-2024

Was this resolved?

I installed openai, tried to import it, and got the error below. I also tried upgrading the libraries, but I ended up with the same error.

Code: import openai
Error: ImportError: cannot import name 'override' from 'typing_extensions' (/databricks/python/lib/python3.10/site-packages/typing_extensions.py)

pip --version => pip 24.2
openai -version => 1.40.1
typing_extensions => 4.12.2

Cluster config:

DBR = 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12)
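The versions listed above are what the package tooling reports, which is not necessarily what the running Python process has already imported. A small sketch for comparing the two, using only the standard library; the helper name installed_vs_loaded is hypothetical, and the interpretation (an older copy imported before the upgrade) is an assumption, not something confirmed in this thread.

```python
import sys
from importlib import metadata


def installed_vs_loaded(dist_name, module_name):
    """Compare the installed distribution version with the module already in memory."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not visible on the current sys.path
    module = sys.modules.get(module_name)  # does not trigger a new import
    return {
        "installed_version": version,
        "already_imported": module is not None,
        "loaded_from": getattr(module, "__file__", None),
    }


info = installed_vs_loaded("typing_extensions", "typing_extensions")
print(info)
```

If installed_version shows the upgraded release while loaded_from still points at the preinstalled cluster path, the process is likely still holding the old module, and a fresh Python process would be needed to pick up the new one.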

ssling0817 · 3 weeks ago

Hi, did you resolve this issue? I also ran into it while using an LLM.

SaadhikaB · 3 weeks ago

Hi, I was able to resolve this by restarting the Python process in Databricks after installing the library.
Let me know if that helps.

Code:

pip install openai

dbutils.library.restartPython()

from openai import AzureOpenAI

Optimusprime · 3 weeks ago

@SaadhikaB
Hi, when I run dbutils.library.restartPython(), I get the following error
