Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

take() operation is throwing error

shelly
New Contributor
Traceback (most recent call last):
  File "/usr/local/spark/python/pyspark/serializers.py", line 458, in dumps
    return cloudpickle.dumps(obj, pickle_protocol)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/spark/python/pyspark/cloudpickle/cloudpickle_fast.py", line 73, in dumps
    cp.dump(obj)
  File "/usr/local/spark/python/pyspark/cloudpickle/cloudpickle_fast.py", line 602, in dump
    return Pickler.dump(self, obj)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/spark/python/pyspark/cloudpickle/cloudpickle_fast.py", line 692, in reducer_override
    return self._function_reduce(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/spark/python/pyspark/cloudpickle/cloudpickle_fast.py", line 565, in _function_reduce
    return self._dynamic_function_reduce(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

operation:

x=[1,2,3,4,5,6,7]

rdd = sc.parallelize(x)

print(rdd.take(2))

2 REPLIES

pvignesh92
Honored Contributor

Hi @Shelly Bhardwaj, this should work. Can you restart your Jupyter kernel, run it again, and check?
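
If it helps, here is a minimal sketch of recreating the Spark context after a restart, assuming a plain local PySpark setup (the master setting and app name below are only illustrative):

from pyspark.sql import SparkSession

# Rebuild the Spark session and context after the kernel restart
# (master and app name are illustrative, not required values).
spark = SparkSession.builder.master("local[*]").appName("take-example").getOrCreate()
sc = spark.sparkContext

x = [1, 2, 3, 4, 5, 6, 7]
rdd = sc.parallelize(x)
print(rdd.take(2))  # expected: [1, 2]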

Anonymous
Not applicable

@Shelly Bhardwaj:

The error message you provided seems to be incomplete, as it only shows the traceback of a serialization error. Can you provide the full error message or describe the issue in more detail?

Regarding the code you provided, it looks correct and should return the first two elements of the RDD without any errors.
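
If the error still appears after a restart, here is a small sketch (standard library only, nothing Spark-specific assumed) that captures the complete traceback so you can paste it here:

import traceback

try:
    x = [1, 2, 3, 4, 5, 6, 7]
    rdd = sc.parallelize(x)
    print(rdd.take(2))  # expected output: [1, 2]
except Exception:
    # Print the full traceback, not just the serializer frames,
    # so the root cause of the pickling failure is visible.
    traceback.print_exc()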
