02-05-2025 06:42 AM
Notebooks in my workspace are not returning _sqldf when a SQL query is run.
If I run the code below, the second cell errors with "_sqldf is not defined".
First Cell:
%sql
select * from some_table limit 10
Second Cell:
%sql
select * from _sqldf
However, the same code runs fine in other people's notebooks in my organization. I suspect it started when I attached my notebook to a SQL warehouse, since my whole notebook was in SQL rather than on all-purpose compute. Now I cannot run the code above even in a new notebook.
Can anybody suggest how to fix this?
Accepted Solutions
02-07-2025 05:41 AM
Changing the notebook's default language to Python and switching to all-purpose compute has fixed the issue. I am able to access _sqldf in subsequent SQL or Python cells.
02-05-2025 06:53 AM
Hi @Somia,
If you run that query on all-purpose compute, does it work fine?
02-06-2025 09:32 AM
I switched my notebook back to all-purpose compute, since _sqldf is not supported in SQL warehouse notebooks. It still didn't work, as explained above.
02-06-2025 12:31 PM - edited 02-06-2025 12:33 PM
To replicate your scenario, you need an all-purpose cluster and a notebook whose default language is Python. Then query a table using %sql as below. This creates a temporary DataFrame (_sqldf) that you can use in Python cells. Keep in mind that this DataFrame is overwritten each time you execute a different %sql cell.
%sql
-- cell 1
select * from catalog.schema.123_sample
# cell 2
display(_sqldf)
To summarize, the %sql magic command behaves differently depending on whether your Databricks notebook is connected to an All-Purpose cluster or a SQL Warehouse.
- All-Purpose Cluster: %sql creates a DataFrame named _sqldf that you can use in subsequent Python cells.
- SQL Warehouse: %sql executes the query but does not create the _sqldf DataFrame.
Please let me know if you have further questions; otherwise, mark this as the solution.
02-06-2025 02:19 PM
@Somia _sqldf is a PySpark DataFrame, not a SQL object.
It only works in one direction:
execute SQL -> reference _sqldf from PySpark
02-07-2025 10:07 AM
@Somia Could you mark my detailed explanation as the solution, since it helped resolve your issue?

