Data Engineering

Forum Posts

SindhujaRaghupa
by New Contributor II
  • 6990 Views
  • 3 replies
  • 1 kudos

Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most recent failure: Lost task 0.0 in stage 4.0 (TID 4, localhost, executor driver): java.lang.NullPointerException

I have uploaded a CSV file that contains well-formatted data, and I was trying to use display(questions), where questions = spark.read.option("header","true").csv("/FileStore/tables/Questions.csv"). This throws the following error: SparkException: Job abo...

Latest Reply
SS2
Valued Contributor
  • 1 kudos

You can use the inferSchema option.
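A minimal sketch of what this suggestion might look like, assuming the file path from the question and an existing SparkSession named spark (Databricks notebooks provide one). The helper name read_questions is hypothetical, and the thread does not confirm that schema inference resolves the NullPointerException.

```python
def read_questions(spark, path="/FileStore/tables/Questions.csv"):
    """Read the CSV with a header row and let Spark infer column types.

    `spark` is an existing SparkSession; `read_questions` is a hypothetical
    helper name used only for illustration.
    """
    return (
        spark.read
        .option("header", "true")       # first line holds the column names
        .option("inferSchema", "true")  # extra pass over the data to guess types
        .csv(path)
    )
```

In a notebook this could be used as questions = read_questions(spark), followed by display(questions).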

2 More Replies
lprevost
by New Contributor II
  • 1643 Views
  • 1 reply
  • 1 kudos

Resolved! Schema inference CSV picks up \r carriage returns

I'm using: frame = spark.read.csv(path=bucket+folder, inferSchema=True, header=True, multiLine=True) to read in a series of CSV ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Files saved on the Windows operating system end every line with a carriage return and a line feed (\r\n). Adding the following option may help: .option("ignoreTrailingWhiteSpace", true)
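As a sketch, the option can also be passed as a keyword argument in the same spark.read.csv call used in the question. The helper name read_csv_folder is hypothetical, spark is assumed to be an existing SparkSession, and whether this removes the stray \r in every case (for example with multiLine enabled) is not confirmed in the thread.

```python
def read_csv_folder(spark, path):
    """Read CSV files under `path`, trimming trailing whitespace from values.

    `read_csv_folder` is a hypothetical helper name; the keyword arguments
    mirror the call in the question plus the suggested option.
    """
    return spark.read.csv(
        path=path,
        header=True,                    # first line holds the column names
        inferSchema=True,               # let Spark guess column types
        multiLine=True,                 # allow quoted values to span lines
        ignoreTrailingWhiteSpace=True,  # the option suggested in the reply
    )
```

With the variables from the question, the call would be frame = read_csv_folder(spark, bucket + folder).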

tourist_on_road
by New Contributor
  • 4702 Views
  • 1 reply
  • 0 kudos

How to read binary data in pyspark

I'm reading the binary file http://snap.stanford.edu/data/amazon/productGraph/image_features/image_features.b using PySpark.
from io import StringIO
import array
img_embedding_file = sc.binaryRecords("s3://bucket/image_features.b", 4106)
def mapper(featur...

Latest Reply
shyam_9
Valued Contributor
  • 0 kudos

Hi @tourist_on_road, please go through the Spark docs below: https://spark.apache.org/docs/2.3.0/api/python/pyspark.html#pyspark.SparkContext.binaryFiles
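For fixed-length records like those returned by sc.binaryRecords in the question, each record arrives as a bytes object that still has to be decoded. The sketch below shows one way to do that in plain Python. The record layout is an assumption, not something stated in the thread: a 10-byte ASCII identifier followed by little-endian 32-bit floats. The names parse_record and ID_LEN are hypothetical.

```python
import struct

ID_LEN = 10  # assumed length of the ASCII identifier at the start of a record


def parse_record(record: bytes, id_len: int = ID_LEN):
    """Split one fixed-length record into (identifier, list of floats).

    Assumed layout: `id_len` ASCII bytes, then little-endian float32 values.
    """
    item_id = record[:id_len].decode("ascii", errors="replace").strip()
    payload = record[id_len:]
    n_floats = len(payload) // 4  # each float32 occupies 4 bytes
    values = struct.unpack(f"<{n_floats}f", payload[: n_floats * 4])
    return item_id, list(values)


# With Spark, the function could be applied per record, for example:
#   sc.binaryRecords(path, record_length).map(parse_record)
```

The record length passed to sc.binaryRecords has to match the real layout of the file, so it is worth checking the dataset's documentation before relying on any particular value.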
