
Importing sentence-transformers no longer works on Databricks Runtime 17.2 ML

excavator-matt
New Contributor III

In Databricks Runtime 16.4 LTS for Machine Learning, I have been able to import sentence-transformers without any installation, as it is part of the runtime, using from sentence_transformers import SentenceTransformer.
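
For reference, this is the kind of out-of-the-box usage I mean (the checkpoint name here is just an example):

from sentence_transformers import SentenceTransformer

# Works out of the box on 16.4 LTS ML; the checkpoint name is only an example.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(["Databricks", "Runtime 16.4 LTS ML"])
print(embeddings.shape)  # e.g. (2, 384) for this model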

In this case I am running on a personal compute cluster, and although the import generates some warnings that I don't know how to interpret, it runs:

2025-09-03 09:39:48.853218: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-09-03 09:39:48.869748: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-09-03 09:39:48.890311: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-09-03 09:39:48.896580: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-09-03 09:39:48.911486: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-09-03 09:39:50.080574: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

However, if I upgrade my cluster to the latest 17.2 ML (Beta) runtime, I instead get this crash:

Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):
/databricks/python/lib/python3.12/site-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs
[Trace ID: 00-91c7cd652996226fe8747ad97efc53e6-f7a274cbb6f58e49-00]

Both runtimes include sentence-transformers (version 3.4.1 versus 4.0.1), so it should work. Is this no longer supported? Is it known to be broken in the beta?

7 REPLIES

Khaja_Zaffer
Contributor III

Hello @excavator-matt 

Which versions of PyTorch and CUDA are you using? Which version of flash-attention are you using?
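
You can print them from a notebook cell; something like this sketch:

import torch
print("torch:", torch.__version__, "CUDA:", torch.version.cuda)

try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)
except Exception as e:  # the broken wheel may fail right here on 17.2
    print("flash-attn import failed:", e)

# These two may themselves crash on 17.2 ML Beta (the bug above).
import transformers, sentence_transformers
print("transformers:", transformers.__version__)
print("sentence-transformers:", sentence_transformers.__version__)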

If there's a matching wheel for your environment, you can download it from https://github.com/Dao-AILab/flash-attention/releases/

I simply run the import in the runtime, so it would be the runtime defaults (CUDA 12.6, torch 2.7.0, and flash_attn 2.7.4.post1). I haven't tried overriding any of these. Not sure if there is a known incompatibility here.
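
For what it's worth, demangling the missing symbol points at a libtorch symbol, which would suggest the bundled flash_attn wheel was built against a different torch C++ ABI than the runtime's torch (my interpretation, not confirmed):

%sh
# c++filt demangles the symbol from the crash; the trailing "Ss" is the
# pre-C++11 std::string, so the wheel seems to expect the old libstdc++ ABI.
echo '_ZN3c105ErrorC2ENS_14SourceLocationESs' | c++filt
# -> c10::Error::Error(c10::SourceLocation, std::string)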

I am trying to find a solution for your issue, but since this morning I have been unable to use Databricks Community Edition; the cluster just spins without an error.


So after running the upgrade, I was able to use:

model="Qwen/Qwen2.5-1.5B-Instruct"
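
In case the exact sequence helps, this is roughly what I ran (I am assuming the transformers pipeline API here for the test; adjust to your own loading code):

%pip install --upgrade transformers
# Restart Python so the upgraded package is picked up by this notebook.
dbutils.library.restartPython()

# In a new cell: a small smoke test with the model above.
from transformers import pipeline
pipe = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")
print(pipe("Hello!", max_new_tokens=20)[0]["generated_text"])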

Khaja_Zaffer
Contributor III

I just checked. Run:

pip install --upgrade transformers

I believe you are working on RAG? 

excavator-matt
New Contributor III

I am not sure if it is possible to replace sentence-transformers with plain transformers, but I think it should be possible to import sentence-transformers, since it is an official part of the runtime. In this case, I was more interested in the resulting vectors from sentence-transformers' encode than in any generated reply.

Hello @excavator-matt 

Thanks for your response. 

I don't have a workspace to test on a 17.2 Beta ML cluster. I highly recommend using an LTS-based cluster.

However, you can try upgrading transformers and then importing sentence_transformers.

After the pip install --upgrade transformers, run from sentence_transformers import SentenceTransformer again in your 17.2 ML Beta cluster. If it still crashes with the undefined symbol error, the upgrade alone might not have fixed the flash-attn compatibility issue (as transformers 4.41.2 might still rely on the runtime's pre-built flash-attn 2.7.4.post1).
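
Putting it together, a sketch of that retest (untested on 17.2 Beta; the checkpoint name is just an example, and the fallback assumes transformers falls back to standard attention when flash-attn is absent):

%pip install --upgrade transformers
# Restart Python so the upgraded transformers is actually imported.
dbutils.library.restartPython()

# In a new cell: retry the import that crashed, plus a tiny encode.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint
print(model.encode(["smoke test"]).shape)

# If the undefined-symbol crash persists, removing the broken wheel may be
# enough, since flash-attn is an optional dependency for transformers:
# %pip uninstall -y flash-attn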

Thank you.