pdiamond
Contributor

In our testing, I've found the JDBC query to be faster than the federated query because the federated query does not push the full query down to the source database. Instead, it runs "select * from table", pulls all of the data into Databricks, and only then filters it before returning results to the notebook. The direct JDBC method passes the entire query down, so the filtering, etc. happens in the source database and only the data I need is retrieved and sent to Databricks.

We noticed this behavior with several different queries of an on-prem SQL Server.
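
For anyone who wants to try the direct JDBC route, here is a minimal sketch of what that read can look like in PySpark. It assumes Spark's built-in JDBC source and its `query` option, which wraps the statement as a subquery so the source database evaluates it. The host, database, table, and credentials are placeholders, and the helper names `jdbc_pushdown_options` and `read_via_jdbc` are made up for illustration, not our actual setup.

```python
def jdbc_pushdown_options(host, database, query, user, password, port=1433):
    """Build Spark JDBC reader options that hand `query` to SQL Server.

    Hypothetical helper: all argument values are placeholders.
    """
    return {
        # Standard SQL Server JDBC URL form; extra properties (encryption,
        # etc.) may be required depending on the driver version.
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        # `query` is used instead of `dbtable`; Spark does not allow both.
        "query": query,
        "user": user,
        "password": password,
    }


def read_via_jdbc(spark, options):
    """Load a DataFrame through the JDBC source using the given options.

    `spark` is the SparkSession (predefined in a Databricks notebook).
    """
    return spark.read.format("jdbc").options(**options).load()


# Placeholder query: the WHERE clause is part of what SQL Server receives,
# so only matching rows should travel back to Databricks.
options = jdbc_pushdown_options(
    host="sqlhost.example.com",
    database="sales",
    query="SELECT order_id, amount FROM dbo.orders WHERE region = 'EU'",
    user="my_user",
    password="my_password",
)

# In a notebook (not run here):
#   df = read_via_jdbc(spark, options)
#   df.explain()  # inspect the physical plan to see what was sent to the source
```

In practice the credentials would come from a secret scope rather than literals. Running `explain()` on the federated version of the same query is a reasonable way to check whether its filters are being pushed down as well.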