Databricks Community

m997al · ‎11-14-2023

While experimenting with LLMs on Databricks clusters, I have become interested in knowing whether an LLM (Llama 2 or otherwise) tries to make calls to the internet (for example, when a Hugging Face model is loaded with trust_remote_code=True).

More broadly, this is about the possibility of malicious code embedded in the LLMs.
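To make the "does it try to call out?" part concrete, here is a minimal sketch of an in-process check, assuming the model code runs in the same Python process and opens connections through Python's standard socket module. The names block_outbound_connections and OutboundConnectionAttempt are hypothetical helpers for illustration, not part of Databricks or Hugging Face. It would not see connections made by native code or subprocesses, and DNS lookups still happen before the connect call, so it is a detection aid rather than a lockdown.

```python
import socket
from contextlib import contextmanager
from unittest import mock


class OutboundConnectionAttempt(RuntimeError):
    """Raised when guarded code tries to open a socket connection."""


@contextmanager
def block_outbound_connections(attempts):
    """Record and refuse Python-level socket connects inside the with-block.

    `attempts` is a caller-supplied list that collects every target address.
    Hypothetical helper: it only covers socket.connect / socket.connect_ex
    called from Python code in this process.
    """

    def refuse(self, address):
        attempts.append(address)
        raise OutboundConnectionAttempt(f"blocked connection attempt to {address!r}")

    # patch.object restores the original methods when the block exits.
    with mock.patch.object(socket.socket, "connect", refuse), \
         mock.patch.object(socket.socket, "connect_ex", refuse):
        yield attempts


if __name__ == "__main__":
    seen = []
    with block_outbound_connections(seen):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Stand-in for "load and run the model here".
            sock.connect(("127.0.0.1", 9))
        except OutboundConnectionAttempt as exc:
            print(exc)
        finally:
            sock.close()
    print("attempted targets:", seen)
```

Wrapping the model load and a few generate calls in the with-block would at least show whether anything in the Python layer reaches for the network, but it says nothing about what a locked-down cluster would actually permit.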

The brute-force solution is to run the LLMs from local files in Databricks, on a cluster that has its network locked down. I suppose we could create a specific Databricks workspace with extremely limited network connections just for this, but is there a way at the compute cluster level to lock down the network, independent of other clusters?
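For the "run from local files" half, this is roughly what I have in mind on the Hugging Face side, as a sketch only. It assumes the model files were already copied to a local directory (the path in the usage note is a made-up placeholder) and that the weights are available in safetensors format. It does not replace a network lockdown; it only asks the libraries not to download anything or execute repository-supplied code.

```python
import os

# Set these before transformers / huggingface_hub are imported,
# so the libraries pick them up at import time.
os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: no HTTP calls to the Hub
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: use only local/cached files


def load_local_model(model_dir):
    """Load tokenizer and model strictly from a local directory.

    Hypothetical helper: `model_dir` must already contain the config,
    tokenizer files and weights.
    """
    # Imported lazily so the environment variables above are already set.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        model_dir,
        local_files_only=True,    # fail instead of downloading
        trust_remote_code=False,  # do not run Python code shipped with the model repo
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_dir,
        local_files_only=True,
        trust_remote_code=False,
        use_safetensors=True,     # assumption: weights exist as .safetensors, avoiding pickle files
    )
    return tokenizer, model
```

Usage would be something like load_local_model("/dbfs/models/llama-2-7b-hf"), where that path is only an example. With local_files_only=True a missing file should raise an error rather than trigger a download.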

Does this question make sense? Thanks!