Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

wojciech_jakubo (New Contributor III)
  • 10385 Views
  • 7 replies
  • 2 kudos

Question about monitoring driver memory utilization

Hi Databricks/Spark experts! I have a piece of pandas-based 3rd-party code that I need to execute as part of a bigger Spark pipeline. By nature, pandas-based code is executed on the driver node. I ran into out-of-memory problems and started exploring th...

[Attached image: driver memory cycles on a busy cluster]
Latest Reply
Tharun-Kumar (Databricks Employee)

Hi @wojciech_jakubo 1. JVM memory will not be utilized for Python-related activities. 2. In the image we can only see the storage memory; there is also execution memory, which would be the same size. Hence I came up with the executor memory to be of ...
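A minimal sketch of logging driver-side memory around a pandas step, assuming psutil is installed on the driver (the names here are illustrative, not from the thread):

```python
import psutil

def log_driver_memory(tag: str) -> None:
    # virtual_memory() reports system-wide usage on the node this runs on,
    # which for notebook/pandas code is the driver.
    mem = psutil.virtual_memory()
    print(f"[{tag}] used={mem.used / 1e9:.1f} GB, "
          f"available={mem.available / 1e9:.1f} GB ({mem.percent}% used)")

log_driver_memory("before pandas step")
# ... run the pandas-based third-party code here ...
log_driver_memory("after pandas step")
```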

6 More Replies
GC-James (Contributor II)
  • 14412 Views
  • 15 replies
  • 5 kudos

Resolved! Lost memory when using dbutils

Why does copying a 9 GB file from a container to /dbfs cost me 50 GB of memory? (Which doesn't come back until I restart the cluster.)
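For reference, the copy described looks roughly like this (container and paths are hypothetical). A common explanation for the "lost" memory is the Linux page cache filling up during the copy, which system-level monitoring counts as used:

```python
# Copy a large file from cloud storage into DBFS; dbutils is available
# by default in Databricks notebooks.
dbutils.fs.cp(
    "abfss://container@account.dfs.core.windows.net/data/bigfile.bin",
    "dbfs:/tmp/bigfile.bin",
)
```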

Latest Reply
AdrianP (New Contributor II)

Hi James, did you get to the bottom of this? We are experiencing the same issue, and none of the suggested solutions seem to work. Thanks, Adrian

14 More Replies
Abhijeet (New Contributor III)
  • 3450 Views
  • 4 replies
  • 5 kudos

How to Read Terabytes of data in Databricks

I want to read 1000 GB of data. Since Spark does in-memory transformations, do I need worker nodes with a combined memory of 1000 GB? Also, I just want to understand: when reading, do we store 1000 GB in memory? And how is the cached DataFrame different from the a...
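A short sketch of why all 1000 GB does not have to fit in memory: Spark plans reads lazily and streams through partitions, and only caching pins data (the path is hypothetical):

```python
# Nothing is read yet; Spark only records the plan.
df = spark.read.parquet("s3://my-bucket/terabyte-dataset/")

# Executors stream through partitions; the full 1000 GB is never
# resident at once.
print(df.count())

# cache() marks the DataFrame for storage on its next computation; with
# the default MEMORY_AND_DISK level, partitions that don't fit in memory
# spill to disk instead of failing.
df.cache()
```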

Latest Reply
Ajay-Pandey (Esteemed Contributor III)

Hi @Abhijeet Singh, the below blog might help you: Link

3 More Replies
gpzz (New Contributor II)
  • 1684 Views
  • 2 replies
  • 1 kudos

MEMORY_ONLY not working

val doubledAmount = premiumCustomers.map(x => (x._1, x._2 * 2)).persist(StorageLevel.MEMORY_ONLY)
error: not found: value StorageLevel

Latest Reply
Chaitanya_Raju (Honored Contributor)

Hi @Gaurav Poojary, can you please try the below, as displayed in the image? It is working for me without any issues. Happy learning!!
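The referenced image is not included here; the usual fix for this particular compile error is importing StorageLevel before using it, e.g.:

```scala
import org.apache.spark.storage.StorageLevel

val doubledAmount = premiumCustomers
  .map(x => (x._1, x._2 * 2))
  .persist(StorageLevel.MEMORY_ONLY)
```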

1 More Replies
Bujji (New Contributor II)
  • 4889 Views
  • 1 reply
  • 3 kudos

How to resolve out of memory error?

Hi, I am working as an Azure support engineer. I found this error while checking a pipeline failure; it shows the below error: "org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 72403.0 failed 4 times, most recent fail...

Latest Reply
Pat (Honored Contributor III)

Hi @mahesh bmk, it would be nice to see the sql_query. Is there some window function used? You might try to run this on a bigger cluster.
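To illustrate the window-function hint: a window with ORDER BY but no PARTITION BY pulls every row into a single task, a classic cause of repeated task failures on large tables. A sketch with hypothetical column names:

```python
from pyspark.sql import Window, functions as F

# No partitionBy: all rows go to one task, which can run out of memory.
w_global = Window.orderBy("event_time")

# Partitioned window: each task only holds the rows for one key.
w_bounded = Window.partitionBy("customer_id").orderBy("event_time")

df = df.withColumn("rn", F.row_number().over(w_bounded))
```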

pjp94 (Contributor)
  • 2717 Views
  • 1 reply
  • 0 kudos

ERROR - Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

I get the below error when running multi-threaded code; it fails towards the end of the run. My guess is that it's related to memory/worker config. I've seen some solutions involving modifying the number of workers or CPUs on the cluster; however, that's n...

Latest Reply
pjp94 (Contributor)

Since I don't have permissions to change cluster configurations, the only solution that ended up working was setting a max thread count to about half of the actual max so I don't overload the containers. However, open to any other optimization ideas!
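A minimal sketch of that workaround, assuming the work is driven from the notebook with a thread pool (run_one_task and tasks are hypothetical):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Cap concurrency at roughly half the available cores so the containers
# are not overloaded.
max_workers = max(1, (os.cpu_count() or 2) // 2)

with ThreadPoolExecutor(max_workers=max_workers) as pool:
    results = list(pool.map(run_one_task, tasks))
```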

chandan_a_v (Valued Contributor)
  • 14599 Views
  • 6 replies
  • 6 kudos

Resolved! Spark Driver Out of Memory Issue

Hi, I am executing a simple job in Databricks, for which I am getting the below error. I increased the driver size but still faced the same issue. Spark config:
from pyspark.sql import SparkSession
spark_session = SparkSession.builder.appName("Demand Forecasting...

Latest Reply
chandan_a_v (Valued Contributor)

I am getting the above issue while writing a Spark DF as a parquet file to AWS S3. Not doing any broadcast join actually.
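Not from the thread, but a common mitigation when a parquet write runs out of memory is to bound how much data each write task handles; a sketch with hypothetical values:

```python
# Repartitioning before the write caps each task's working set.
(df.repartition(200)
   .write
   .mode("overwrite")
   .parquet("s3://my-bucket/forecast-output/"))
```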

5 More Replies
pavanb (New Contributor II)
  • 10199 Views
  • 3 replies
  • 3 kudos

Resolved! Memory issues - Databricks

Hi all, all of a sudden in our Databricks dev environment we are getting memory-related exceptions such as out of memory, result too large, etc. Also, the error message is not helping to identify the issue. Can someone please guide on what would be...

Latest Reply
pavanb (New Contributor II)

Thanks for the response @Hubert Dudek. If I run the same code in the test environment, it completes successfully, while in dev it gives an out-of-memory issue. Also, the configuration of the test and dev environments is exactly the same.
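One way to test the "exactly the same" assumption is to dump the effective Spark configuration in both workspaces and diff the output; a minimal sketch:

```python
# Print every effective Spark conf key/value so dev and test can be compared.
for key, value in sorted(spark.sparkContext.getConf().getAll()):
    print(f"{key} = {value}")
```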

2 More Replies
brickster_2018 (Databricks Employee)
  • 5171 Views
  • 1 reply
  • 0 kudos

Resolved! Does Ganglia report incorrect memory stats?

I am looking at the memory utilization of the executors, and I see that the heap utilization of the executors is far less than what is reported in Ganglia. Why does Ganglia report incorrect memory details?

Latest Reply
brickster_2018 (Databricks Employee)

Ganglia reports memory utilization at the system level. Say, for example, the JVM has an Xmx value of 100 GB. At some point it will occupy 100 GB, and then a garbage collection will clear off the heap. Once the GC frees up the memory, th...
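A sketch of the comparison being described, reading the JVM's own heap figures next to the system-level figure Ganglia sees (this goes through PySpark's internal _jvm handle, so treat it as illustrative only):

```python
import psutil

# Heap as the JVM sees it (what the Spark UI reports).
runtime = spark._jvm.java.lang.Runtime.getRuntime()
heap_used_gb = (runtime.totalMemory() - runtime.freeMemory()) / 1e9
print(f"JVM heap in use: {heap_used_gb:.1f} GB")

# Memory as the OS sees it (what Ganglia reports). After a GC the JVM
# usually keeps the freed pages, so the OS still shows them as used.
print(f"System memory used: {psutil.virtual_memory().used / 1e9:.1f} GB")
```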

Juan_MiguelTrin (New Contributor)
  • 6966 Views
  • 1 reply
  • 0 kudos

How to resolve out of memory error?

I have a Databricks notebook hosted on Azure. I am having this problem when doing an INNER JOIN. I tried a much larger cluster configuration, but it still produces an OutOfMemoryError: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquir...

Latest Reply
shyam_9 (Databricks Employee)

Hi @Juan Miguel Trinidad, can you please check the below suggestions: http://apache-spark-developers-list.1001551.n3.nabble.com/java-lang-OutOfMemoryError-Unable-to-acquire-bytes-of-memory-td16773.html
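Alongside that link, two common mitigations for this error on an INNER JOIN, sketched with hypothetical DataFrame and column names:

```python
from pyspark.sql import functions as F

# 1. If one side is small, broadcast it to avoid a memory-heavy shuffle.
joined = large_df.join(F.broadcast(small_df), on="id", how="inner")

# 2. Otherwise, raise shuffle parallelism so each task's slice shrinks.
spark.conf.set("spark.sql.shuffle.partitions", "800")
```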
