Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

chanansh
by Contributor
  • 1437 Views
  • 3 replies
  • 0 kudos

Delta table: grouping by a key that the table is not partitioned by is very slow

I have a large Delta table with timestamp, key, and metric columns (e.g. m1, m2, ...). I often group by the key (e.g. select max(m1) group by timestamp, key). I cannot partition by `key` because it has too many distinct values (~200K). I have tried ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Hanan Shteingart, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. T...

2 More Replies
Retko
by Contributor
  • 13951 Views
  • 5 replies
  • 8 kudos

Databricks notebook sometimes takes too long to run a query (even on an empty table)

Hi, sometimes I notice that running a query takes too long, even for simple queries, and the next time I run the same query it runs much faster. I have a cluster running (DBR 10.4 LTS • 5 workers) and it constantly has several workers. An example of a query is s...

Latest Reply
j_afanador
Contributor II
  • 8 kudos

Probably the cluster is always in use and your query ends up waiting behind queries that are already being processed, or the cluster auto-stops after each time you use it.

4 More Replies