Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Confused about large memory usage of cluster

guangyi
Contributor II

We set up a demo DLT pipeline with no data involved:

import dlt

@dlt.table(
    name="demo"
)
def sample():
    # A single constant row; no source data is read.
    df = spark.sql("SELECT 'silver' AS Layer")
    return df

 

However, when we check the cluster metrics, about 10 GB of memory already appears to be in use, which doesn’t make sense for a pipeline that processes no data.

I noticed that the access mode for the cluster is “Shared”. Does this mean the 10 GB of memory might have been consumed by other users?

If so, do we use the cluster at the same time, or do I take it over after the other user finishes?

1 REPLY

Kaniz_Fatma
Community Manager

Hi @guangyi, please check the cluster metrics by navigating to the Compute section and selecting the Metrics tab to monitor memory usage. If memory consumption is high, consider tuning your cluster configuration with properties like spark.databricks.delta.optimizeWrite.enabled and spark.databricks.delta.autoCompact.enabled. Additionally, coordinating with other users for dedicated access, or setting up a separate cluster for your tasks, might help reduce resource contention. For more details, refer to the Databricks cluster metrics guide.
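
To cross-check what the Metrics tab reports, one option is to read the operating system's own memory counters from a notebook attached to the cluster. The following is a minimal sketch, not an official Databricks utility: it assumes the Python process runs on a Linux node that exposes /proc/meminfo, it only covers the node the code runs on (typically the driver), and the helper names parse_meminfo, used_gib, and node_memory_report are hypothetical. Its figure may not match the Metrics tab exactly, since "used" is computed here simply as MemTotal minus MemAvailable.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into a {field: value} mapping.

    Sizes are reported by the kernel in kB; fields without a unit
    (e.g. HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name:   12345 kB" style lines.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def used_gib(meminfo: dict[str, int]) -> float:
    """Memory in use, in GiB, taken as MemTotal - MemAvailable."""
    used_kib = meminfo["MemTotal"] - meminfo["MemAvailable"]
    return used_kib / (1024 ** 2)


def node_memory_report(path: str = "/proc/meminfo") -> str:
    """Summarize memory on the node where this code runs (assumes Linux)."""
    info = parse_meminfo(Path(path).read_text())
    total_gib = info["MemTotal"] / (1024 ** 2)
    return f"used {used_gib(info):.1f} GiB of {total_gib:.1f} GiB"
```

Calling print(node_memory_report()) in a notebook cell on an idle cluster gives a baseline to compare against. Some memory is expected to be in use even when no data is processed, because the operating system and the Spark JVM occupy memory as soon as the node starts.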
