Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by Alex_Persin, New Contributor III
  • 5279 Views
  • 4 replies
  • 6 kudos

How can the shared memory size (/dev/shm) be increased on Databricks worker nodes with custom Docker images?

PyTorch uses shared memory to efficiently share tensors between its dataloader workers and its main process. However, in a Docker container the default size of the shared memory (a tmpfs file system mounted at /dev/shm) is 64 MB, which is too small to ...
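The thread's resolution is not visible in this excerpt, but as a rough sketch (not the accepted answer) a common PyTorch-side workaround when /dev/shm is too small is to stop relying on it, either by switching the tensor-sharing strategy or by loading data in the main process:

# Sketch of a PyTorch-side workaround for a small /dev/shm (e.g. the 64 MB
# Docker default). It does not enlarge the tmpfs; it only reduces PyTorch's use of it.
import shutil

import torch
import torch.multiprocessing as mp
from torch.utils.data import DataLoader, TensorDataset

def shm_size_mb() -> float:
    # Total size of the tmpfs mounted at /dev/shm, in megabytes.
    total, _, _ = shutil.disk_usage("/dev/shm")
    return total / (1024 ** 2)

dataset = TensorDataset(torch.randn(1024, 3, 224, 224))  # toy stand-in data

if shm_size_mb() < 512:  # arbitrary threshold for this sketch
    # Route inter-process tensor sharing through the filesystem instead of
    # /dev/shm (slower, but avoids DataLoader worker "bus error" crashes) ...
    mp.set_sharing_strategy("file_system")
    # ... or, simplest of all, load data in the main process only.
    loader = DataLoader(dataset, batch_size=32, num_workers=0)
else:
    loader = DataLoader(dataset, batch_size=32, num_workers=4)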

Latest Reply
OxFF
New Contributor II
  • 6 kudos

Recently stumbled on this problem. It seems it basically makes it impossible to use compute with custom Docker images for any PyTorch-based real-life computer vision ML experiments, which is unfortunate. +1 for requesting followup and possible al...

3 More Replies
by SaraCorralLou, New Contributor III
  • 23995 Views
  • 5 replies
  • 2 kudos

Resolved! Error: The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached.

What is the problem? I am getting this error every time I run a Python notebook from my Repo in Databricks. Background: The notebook where I am getting the error creates a dataframe, and the last step is to write the dataframe to a Delta ...
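A frequent cause of this driver restart is driver-side memory pressure, for example collect() or toPandas() before the write. A minimal sketch, assuming that is the case here (the table paths below are placeholders, not from the post), keeps the data and the write on the executors:

# Minimal sketch: write to Delta directly from the executors instead of
# pulling the data onto the driver first. Paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

df = (
    spark.read.format("delta").load("/mnt/raw/source_table")  # hypothetical input
    .withColumn("load_date", F.current_date())
)

# Avoid df.collect() / df.toPandas() here: both materialise the full dataset
# on the driver and can trigger the "driver has stopped unexpectedly" restart.
df.write.format("delta").mode("overwrite").save("/mnt/curated/target_table")  # hypothetical output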

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sara Corral, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

4 More Replies
by zzy, New Contributor III
  • 2048 Views
  • 3 replies
  • 2 kudos

Why is PyTorch CUDA total memory not aligned with the memory size of the GPU cluster I created?

No matter what size of GPU cluster I create, the CUDA total capacity is always ~16 GB. Does anyone know what the issue is? The code I use to get the total capacity: torch.cuda.get_device_properties(0).total_memory
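The excerpt already shows the relevant call; as a sketch of why the number looks fixed, get_device_properties(0) reports the VRAM of a single GPU (about 16 GB on a T4, for example), not the cluster's total RAM, and larger clusters typically add more workers or more GPUs rather than bigger ones. Enumerating every visible device makes that clear:

# Enumerate all GPUs visible to this node: each entry reports per-device VRAM,
# which stays at the card's fixed size regardless of how large the cluster is.
import torch

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GiB VRAM")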

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Simon Zhang, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...

2 More Replies