Langchain + Databricks SQL + Dolly

ws-kp
New Contributor III

Hi there,

referring to this LangChain document here: SQL Database Agent — 🦜🔗 LangChain 0.0.157

is it possible to use LangChain's SQL Database Agent with Databricks SQL and Dolly?

If so, could someone kindly advise what the Python syntax would be?

db = SQLDatabase.from_uri(.....)

agent_executor = dolly

Thanks.

Wes

1 ACCEPTED SOLUTION


sean_owen
Honored Contributor II

This pattern works for me:

from sqlalchemy.engine import create_engine
from langchain import SQLDatabase, SQLDatabaseChain
 
# the URI uses the Databricks SQLAlchemy dialect: "token" as the username, a
# personal access token (dapi...) as the password, then your workspace hostname
engine = create_engine(
  "databricks+connector://token:dapi...@....cloud.databricks.com:443/default",
  connect_args={"http_path": "/sql/1.0/warehouses/...",})
 
# point LangChain at the engine, restricting it to the tables it is allowed to see
db = SQLDatabase(engine, schema="default", include_tables=["nyc_taxi"])

View solution in original post

10 REPLIES

Anonymous
Not applicable

@Wesley Shen:

it seems like LangChain's SQL Database Agent is designed to work with any SQL database that SQLAlchemy can connect to, which includes Databricks SQL. However, it's unclear whether it works with Dolly, as Dolly is not mentioned in the documentation.

Assuming that LangChain's SQL Database Agent works with Databricks SQL, you can use the following Python code to create an instance of SQLDatabase from the URI of your Databricks SQL endpoint:

from langchain import SQLDatabase
 
# replace <your-databricks-sql-uri> with the URI of your Databricks SQL endpoint
databricks_sql_uri = "<your-databricks-sql-uri>"
 
db = SQLDatabase.from_uri(databricks_sql_uri)

Once you have created an instance of SQLDatabase, you can pair it with an LLM to build an agent executor that runs SQL queries against your Databricks SQL database. Here is an example:

from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
 
# create an agent executor using the SQLDatabase instance created earlier and an
# LLM you have already constructed (newer LangChain versions may also require
# passing the llm to SQLDatabaseToolkit)
toolkit = SQLDatabaseToolkit(db=db)
agent_executor = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
 
# ask a question in natural language; the agent generates and runs the SQL itself
result = agent_executor.run("How many rows are in my_table?")
 
# process the result
print(result)

Note that the exact syntax may depend on the version of LangChain's SQL Database Agent you are using, as well as any specific configuration options you need to set for your Databricks SQL endpoint.

ws-kp
New Contributor III

thanks @Suteja Kanuri, but what I'm wanting to know is how to populate the parameters specific to Databricks, e.g. what would the syntax for <your-databricks-sql-uri> be in this case?

Thanks

I had to pip install sqlalchemy-databricks.

from langchain import SQLDatabase
from sqlalchemy.engine import URL
 
TOKEN = "MY-TOKEN"
HOST = "your cloud host .net/.com" # the hostname of your Databricks workspace
PORT = 443 # your port
DB = "your DB"
CATALOG = "hive_metastore" # the default catalog; may differ depending on your configuration
HTTP_PATH = "/sql/1.0/xxxxxxxxxxx"
 
 
URI = URL.create(
    "databricks",
    username="token",
    password=TOKEN,
    host=HOST,
    port=PORT,
    database=DB,
    query={
        "http_path": HTTP_PATH,
        "catalog": CATALOG,
        "schema": DB
    }
)
 
db = SQLDatabase.from_uri(URI)

Help from https://www.andrewvillazon.com/connect-databricks-sqlalchemy/ and https://github.com/hwchase17/langchain/issues/2277
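
If the connection is set up correctly, a quick sanity check along these lines (a minimal sketch using the db object created above) confirms LangChain can reach the warehouse before you wire in an agent or chain:

# run a trivial query through the LangChain wrapper
print(db.run("SELECT 1"))
 
# inspect the table metadata LangChain will hand to the LLM
print(db.table_info)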

ws-kp
New Contributor III

Hi @Antoine Tavernier

Nice work trying this. I came across this too and tried it myself, with no success.

@Wesley Shen What is your error? We cannot help you without any error messages.

It could come from a typo in your settings, an expired token, or your firewall not giving you access to your cluster. There are many possible causes.

You can also try the Spark SQL Agent (recently added to LangChain), directly inside a Databricks notebook; a sketch follows the link below.

https://python.langchain.com/en/latest/modules/agents/toolkits/examples/spark_sql.html
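
For reference, a minimal sketch of that Spark SQL Agent inside a Databricks notebook could look like this (the "default" schema, the table name, and the llm variable are assumptions; llm can be any LangChain LLM you have already constructed):

from langchain.agents import create_spark_sql_agent
from langchain.agents.agent_toolkits import SparkSQLToolkit
from langchain.utilities.spark_sql import SparkSQL
 
# inside a Databricks notebook the active SparkSession is picked up automatically
spark_sql = SparkSQL(schema="default")
 
toolkit = SparkSQLToolkit(db=spark_sql, llm=llm)
agent_executor = create_spark_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
 
agent_executor.run("How many rows are in the nyc_taxi table?")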

sean_owen
Honored Contributor II

This pattern works for me:

from sqlalchemy.engine import create_engine
from langchain import SQLDatabase, SQLDatabaseChain
 
# the URI uses the Databricks SQLAlchemy dialect: "token" as the username, a
# personal access token (dapi...) as the password, then your workspace hostname
engine = create_engine(
  "databricks+connector://token:dapi...@....cloud.databricks.com:443/default",
  connect_args={"http_path": "/sql/1.0/warehouses/...",})
 
# point LangChain at the engine, restricting it to the tables it is allowed to see
db = SQLDatabase(engine, schema="default", include_tables=["nyc_taxi"])
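
To actually query through it, hand that db to the SQLDatabaseChain imported above together with an LLM. A minimal sketch (the llm variable is an assumption here, e.g. Dolly wrapped as a HuggingFacePipeline as in the next reply):

# combine the Databricks-backed SQLDatabase with an LLM
db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)
 
# natural-language question in; the chain writes and runs the SQL against nyc_taxi
db_chain.run("How many taxi trips are in the nyc_taxi table?")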

sean_owen
Honored Contributor II

Dolly works fine with SQLDatabaseChain. There is no need to support specific models from Hugging Face, as any model on HF can be plugged in. Load any pipeline and ...

from langchain.llms import HuggingFacePipeline
 
# wrap an existing transformers pipeline (pipe) so LangChain can use it as an LLM
hf_pipeline = HuggingFacePipeline(pipeline=pipe)
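
For completeness, the pipe referenced above can be built from one of the Dolly checkpoints with the transformers library. A minimal sketch, assuming the databricks/dolly-v2-3b checkpoint and a GPU with enough memory:

import torch
from transformers import pipeline
 
# Dolly ships its own instruction-following pipeline, hence trust_remote_code=True;
# return_full_text=True keeps the prompt in the output, which LangChain expects
pipe = pipeline(
    model="databricks/dolly-v2-3b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    return_full_text=True,
)

The resulting hf_pipeline can then be passed as the llm to SQLDatabaseChain or to the SQL Database Agent.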

Anonymous
Not applicable

Hi @Wesley Shen

Hope all is well! Just wanted to check in to see if you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

ws-kp
New Contributor III

Hi @Vidula Khanna,

Unfortunately, a resolution has not been provided, so I can't mark an answer.

Thanks

Wes

Jbh
New Contributor II

Ok
