by
ae20cg
• New Contributor III
- 17275 Views
- 17 replies
- 12 kudos
I want to run a block of code in a script, not in a notebook, on Databricks, but I cannot properly instantiate the Spark context without some error. I have tried `SparkContext.getOrCreate()`, but this does not work. Is there a simple way to do t...
Latest Reply
Is there some solution for this? We got stuck where a cluster with Unity Catalog is not able to get a Spark context. This prevents us from using the distributed nature of Spark in Databricks.
16 More Replies
- 826 Views
- 0 replies
- 0 kudos
I have a code:

from time import sleep
from random import random
from operator import add

def f(a: int) -> float:
    sleep(0.1)
    return random()

rdd1 = sc.parallelize(range(20), 2)
rdd2 = sc.parallelize(range(20), 2)
rdd3 = sc.parallelize(rang...
by
Fed
• New Contributor III
- 7959 Views
- 1 replies
- 0 kudos
Tree-based estimators in pyspark.ml have an argument called checkpointInterval:

checkpointInterval = Param(parent='undefined', name='checkpointInterval', doc='set checkpoint interval (>= 1) or disable checkpoint (-1). E.g. 10 means that the cache will ...
Latest Reply
@Federico Trifoglio : If sc.getCheckpointDir() returns None, it means that no checkpoint directory is set in the SparkContext. In this case, the checkpointInterval argument will indeed be ignored. To set a checkpoint directory, you can use the SparkC...
by
KateK
• New Contributor II
- 2348 Views
- 2 replies
- 1 kudos
I have some code that uses RDDs and the sc.parallelize() and rdd.toDF() methods to get a DataFrame back out. The code works in a regular notebook (and if I run the notebook as a job), but fails if I do the same thing in a DLT pipeline. The error mess...
Latest Reply
Thanks for your help, Alex. I ended up re-writing my code with Spark UDFs -- maybe there is a better solution with only the DataFrame API, but I couldn't find it. To summarize my problem: I was trying to un-nest a large JSON blob (the fake data in my f...
1 More Replies
- 14828 Views
- 5 replies
- 0 kudos
Why do I get this error on my browser screen,
<type 'exceptions.Exception'>: Java gateway process exited before sending the driver its port number args = ('Java gateway process exited before sending the driver its port number',) message = 'Java gat...
Latest Reply
I'm facing the same problem; does anybody know how to connect to Spark from an IPython notebook?
The issue I created:
https://github.com/jupyter/notebook/issues/743
4 More Replies