Confused about large memory usage of cluster

guangyi — Mon, 05 Aug 2024 08:48:39 GMT

We set up a demo DLT pipeline with no data involved:

import dlt

# Minimal table definition: returns a single literal row, reads no source data
@dlt.table(name="demo")
def sample():
    df = spark.sql("SELECT 'silver' as Layer")
    return df

However, when we check the cluster metrics, about 10 GB of memory is already in use, which doesn’t make sense for a pipeline that processes no data.

I noticed that the cluster’s access mode is “Shared”. Does this mean the 10 GB of memory might have been consumed by other users?

If so, do we all use the cluster at the same time, or do I take it over only after the other user finishes?
