Data Engineering

Forum Posts

SaraCorralLou
by New Contributor III
  • 13610 Views
  • 5 replies
  • 2 kudos

Resolved! Error: The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached.

What is the problem? I am getting this error every time I run a Python notebook in my Repo in Databricks. Background: The notebook where I am getting the error creates a dataframe, and the last step is to write the dataframe to a Delta ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sara Corral, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

4 More Replies
zzy
by New Contributor III
  • 1256 Views
  • 3 replies
  • 2 kudos

Why is pytorch cuda total memory not aligned with the memory size of GPU cluster I created?

No matter what size of GPU cluster I create, the CUDA total capacity is always ~16 GB. Does anyone know what the issue is? The code I use to get the total capacity: torch.cuda.get_device_properties(0).total_memory
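A minimal sketch that may help narrow this down, assuming PyTorch is installed on the cluster. `torch.cuda.get_device_properties(i)` describes a single device visible to the current process, so `total_memory` is the memory of one GPU on the node running the notebook, not a cluster-wide total. Whether that explains the ~16 GB figure here is not confirmed in the thread. The helper names `bytes_to_gib` and `local_gpu_memory` are illustrative and not part of PyTorch.

```python
def bytes_to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return num_bytes / 1024**3


def local_gpu_memory() -> list:
    """Return (index, name, total GiB) for each CUDA device visible to this process.

    Returns an empty list if PyTorch is missing or no CUDA device is available.
    """
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    devices = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        devices.append((index, props.name, bytes_to_gib(props.total_memory)))
    return devices


# Print one line per local GPU instead of looking only at device 0.
for index, name, gib in local_gpu_memory():
    print(f"cuda:{index} {name}: {gib:.1f} GiB")
```

Running this on the driver lists every GPU attached to that node, which makes it easier to compare the reported per-device memory against the instance type chosen for the cluster.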

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Simon Zhang, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...

2 More Replies
Alex_Persin
by New Contributor II
  • 3221 Views
  • 2 replies
  • 2 kudos

How can the shared memory size (/dev/shm) be increased on databricks worker nodes with custom docker images?

PyTorch uses shared memory to efficiently share tensors between its dataloader workers and its main process. However, in a Docker container the default size of the shared memory (a tmpfs file system mounted at /dev/shm) is 64MB, which is too small to ...
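Before changing anything, it can help to confirm how much shared memory the container actually has. The sketch below uses only the Python standard library to report the total size of the file system mounted at a path. The function name `mount_size_mib` is illustrative, and the `/dev/shm` default is taken from the post above.

```python
import os
import shutil


def mount_size_mib(path: str = "/dev/shm"):
    """Return the total size in MiB of the file system containing `path`.

    Returns None if the path does not exist (for example, on a non-Linux host).
    """
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).total / 1024**2


size = mount_size_mib()
if size is None:
    print("/dev/shm is not available on this machine")
else:
    print(f"/dev/shm total size: {size:.0f} MiB")
```

Run from a notebook cell on the node in question, a result close to 64 MiB would match the Docker default described in the post.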

Latest Reply
mstuder
New Contributor II
  • 2 kudos

Also interested in increasing shared memory for use with Ray.

1 More Replies