Hello!
I'm using a serverless SQL cluster on Databricks, and I have a dataset in a Delta table with 500 billion rows. I'm trying to filter it down to around 7 billion rows and then cache that dataset so that other queries can use it and run faster.
When I cache the table, it takes 1 s and gives no error or warning.
When I select from the cached table, I get an error saying it cannot be found.
This is what I'm doing:
CACHE TABLE table_filtered_cache AS
SELECT * FROM prod_datalake.table a
WHERE
a.year >= 2023 etc
and then
SELECT COUNT(*) FROM table_filtered_cache
What am I doing wrong, and what would you advise me to do?