- 5057 Views
- 2 replies
- 1 kudos
Hi, I'm running all my jobs on one big cluster. Is there a way to clear the cache a notebook produces at the end of a job, once it's done, so it doesn't cause memory problems from one job to the next, ...
Latest Reply
Hi @krisna math We haven't heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if their suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to...
1 More Replies
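A minimal sketch of that end-of-job cleanup, assuming a Databricks Scala notebook where `spark` is predefined and `df` is a placeholder for one of the cached DataFrames:

```
// Run as the last cell of the job's notebook.
df.unpersist(blocking = true)  // release one specific cached DataFrame
spark.catalog.clearCache()     // drop everything cached in this session
```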
by jlgr • New Contributor II
- 4384 Views
- 2 replies
- 0 kudos
Hi! I want to disable the disk cache for a SQL Warehouse in Azure Databricks, but it seems that this is not possible. Is that correct? You can't use this configuration for a SQL Warehouse (https://learn.microsoft.com/en-US/azure/databricks/optimizations/disk-cache#-...
Latest Reply
Hi @jlgr jlgr Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...
1 More Replies
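For context, on a regular all-purpose cluster (not a SQL Warehouse) the disk cache can be toggled per session; a minimal sketch, assuming a Scala notebook:

```
// Disable the Databricks disk (IO) cache for this session on a regular cluster.
// SQL Warehouses manage this setting themselves and don't allow overriding it.
spark.conf.set("spark.databricks.io.cache.enabled", "false")
```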
- 30123 Views
- 9 replies
- 2 kudos
Hi all, I am using a persist call on a Spark DataFrame inside an application to speed up computations. The DataFrame is used throughout my application, and at the end of the application I am trying to clear the cache of the whole Spark session by calli...
Latest Reply
No solution yet: Hi @Suteja Kanuri, thank you for thinking along and replying! Unfortunately, I have not found a solution yet. I am getting an error that there is no ```.getCache()``` method on a Spark context. Also note that I have tried to do som...
8 More Replies
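Indeed, `SparkContext` has no `.getCache()` method; what the Scala API does expose is `getPersistentRDDs`. A minimal sketch of clearing everything cached in a session, assuming a notebook where `spark` is in scope:

```
// Drop all cached DataFrames/tables registered through the SQL catalog...
spark.catalog.clearCache()
// ...and unpersist any RDDs persisted directly on the SparkContext.
spark.sparkContext.getPersistentRDDs.values.foreach(_.unpersist(blocking = true))
```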
- 2959 Views
- 2 replies
- 1 kudos
Here are the simple steps to reproduce it. Note that the columns "foo" and "bar" are just redundant columns, added to make sure the DataFrame doesn't fit into a single partition.
// generate a random df
val rand = new scala.util.Random
val df = (1 to 3000).map(i => (r...
Latest Reply
Hi @Jerry Xu Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback wil...
1 More Replies
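Since the snippet above is cut off, here is a hypothetical self-contained reconstruction of that kind of repro; the exact expressions inside the `map` are lost, so the column contents below are assumptions:

```
import scala.util.Random
import spark.implicits._ // already in scope in Databricks notebooks

// Hypothetical reconstruction: wide "foo"/"bar" padding columns push the
// DataFrame across multiple partitions.
val rand = new Random
val df = (1 to 3000)
  .map(i => (rand.nextInt(), "foo" * 100, "bar" * 100))
  .toDF("id", "foo", "bar")
df.cache()
df.count() // materialize the cache
```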
by fury88 • New Contributor II
- 2287 Views
- 1 replies
- 1 kudos
I'm trying to cache data/queries that we normally have as temporary views that get replaced when the code is run, based on dynamic Python. What I'd like to know is: will CACHE TABLE get overwritten each time you run it? Is it smart enough to recognize ...
Latest Reply
Hi @Matt Fury Yes... I guess the cache gets overwritten each time you run it, because for me it took nearly the same amount of time for 1 million records to be cached. However, you can check whether the table is cached or not using the .storageLevel method. E.g. I have...
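A minimal sketch of that check, assuming a Scala notebook and a placeholder view name:

```
// Re-running CACHE TABLE on the same name re-caches the current definition.
spark.sql("CACHE TABLE my_temp_view") // hypothetical view name
// Verify caching via the catalog, or via the Dataset's storage level:
println(spark.catalog.isCached("my_temp_view"))
println(spark.table("my_temp_view").storageLevel)
```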
- 2896 Views
- 3 replies
- 0 kudos
Hello everybody, I recently discovered (the hard way) that when a query plan uses cached data, AQE does not kick in. The result is that you lose the super cool feature of dynamic partition coalescing (no more custom shuffle readers in the DAG). Is ther...
Latest Reply
Hi @Pantelis Maroudis, did you check the physical query plan? Did you check the SQL sub-tab within the Spark UI? It will help you to understand better what is happening.
2 More Replies
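A minimal sketch of that kind of inspection, assuming a Scala notebook and placeholder table/column names; cached scans appear as `InMemoryTableScan` nodes in the plan:

```
spark.conf.set("spark.sql.adaptive.enabled", "true")
val q = spark.table("some_table").groupBy("key").count() // hypothetical names
q.explain(true) // an AdaptiveSparkPlan node means AQE can re-optimize;
                // an InMemoryTableScan node marks the cached path it skips
```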
by 368545 • New Contributor III
- 3255 Views
- 2 replies
- 2 kudos
We got the following error when running queries on Redash connected to Databricks earlier today (2022-08-24): ```Error running query: [HY000] [Simba][Hardy] (35) Error from server: error code: '0' error message: 'org.apache.spark.sql.catalyst.expressions.U...
Latest Reply
This can be related to user permissions, particularly the permissions necessary to access the table in the database instance. I understand it is working fine in the SQL editor, but can we still check the permissions?
1 More Replies
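A minimal sketch of such a permission check, assuming a table-ACL-enabled workspace and placeholder table/principal names:

```
// Inspect the existing grants on the table the Redash user queries...
spark.sql("SHOW GRANTS ON TABLE my_db.my_table").show(false)
// ...and grant read access if it is missing (hypothetical principal).
spark.sql("GRANT SELECT ON TABLE my_db.my_table TO `redash_user@example.com`")
```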
- 3571 Views
- 5 replies
- 4 kudos
The problem: we have a DataFrame which is based on the query: SELECT * FROM Very_Big_Table. This table returns over 4 GB of data, and when we try to push the data to Power BI we get the error message: ODBC: ERROR [HY000] [Microsoft][Hardy] (35) Error from...
Latest Reply
Hey @Hila Galapo Hope everything is going well. Just wanted to check in if you were able to resolve your issue or if you need more help. We'd love to hear from you. Thanks!
4 More Replies
- 6686 Views
- 1 replies
- 1 kudos
I've seen .cache() and .checkpoint() used similarly in some workflows I've come across. What's the difference, and when should I use one over the other?
Latest Reply
Caching is more useful than checkpointing when you have a lot of available memory to store your RDDs or DataFrames, even if they are massive. Caching will maintain the result of your transformations so that those transformations will not have to be recomp...
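A minimal sketch of the difference, assuming a Scala notebook (`/tmp/checkpoints` is a placeholder path): `cache()` keeps the lineage and stores results for reuse, while `checkpoint()` materializes the data to reliable storage and truncates the lineage:

```
spark.sparkContext.setCheckpointDir("/tmp/checkpoints") // placeholder path
val base = spark.range(1000000L).toDF("n")
val cached = base.cache()        // kept in memory/disk, lineage retained
val checked = base.checkpoint()  // written to the checkpoint dir, lineage cut
println(cached.count())
println(checked.count())
```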
- 946 Views
- 0 replies
- 0 kudos
When we run the SQL statements "DROP TABLE ... CREATE TABLE" for the same table in multiple places (different notebooks, jobs, ...), some notebooks may not see the most recent schema/content.
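No replies yet, but Spark caches table metadata per session, so one common mitigation is to invalidate that cache before reading; a minimal sketch with a placeholder table name:

```
// Refresh cached metadata before reading a table that another
// notebook or job may have dropped and recreated.
spark.sql("REFRESH TABLE my_db.my_table") // hypothetical name
// Equivalent catalog API call:
spark.catalog.refreshTable("my_db.my_table")
```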