spark exception error while reading a parquet file

shamly
New Contributor III

When I try to read a parquet file from an Azure Data Lake container in Databricks, I get a Spark exception. Below is my query:

import pyarrow.parquet as pq
from pyspark.sql.functions import *
from datetime import datetime

data = spark.read.parquet(f"/mnt/data/country/abb/countrydata.parquet")

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 14.0 failed 4 times, most recent failure: Lost task 0.3 in stage 14.0 (TID 35) (10.135.39.71 executor 0): org.apache.spark.SparkException: Exception thrown in awaitResult:

What does this mean, and what do I need to do to fix it?

2 REPLIES

Debayan
Esteemed Contributor III

Hi @shamly pt, could you please post the full error stack here?

DavideAnghileri
Contributor

Hi @shamly pt, more info is needed to solve the issue. However, the common causes are:

  • The storage is not mounted
  • The file doesn't exist in the mounted storage
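The two checks above can be sketched roughly like this (a hypothetical snippet, assuming it runs in a Databricks notebook where `dbutils` is predefined; outside Databricks it just records that the check could not run):

```python
# Hypothetical sanity check for the two common causes above. Assumes a
# Databricks notebook, where `dbutils` is predefined; outside Databricks
# it records that the check could not run.
path = "/mnt/data/country/abb/countrydata.parquet"

try:
    fs = dbutils.fs  # noqa: F821 - dbutils exists only inside Databricks
except NameError:
    status = "no-dbutils"  # not running inside a Databricks notebook
else:
    # 1. Is the storage actually mounted? Compare the path against the
    #    current mount table.
    mount_points = [m.mountPoint for m in fs.mounts()]
    if not any(path.startswith(mp) for mp in mount_points):
        status = "not-mounted"
    else:
        # 2. Does the file exist? fs.ls raises if the path is missing.
        try:
            fs.ls(path)
            status = "ok"
        except Exception:
            status = "missing-file"

print(status)
```

If the mount is missing, remounting the container (or using the full `abfss://` URI with credentials) is the usual fix; if the file is missing, listing the parent directory with `dbutils.fs.ls` shows what is actually there.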

Also, there is no need for an f-string when the string contains no curly-brace expressions, so you can drop the f in `f"/mnt/data/country/abb/countrydata.parquet"`.
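For example, the two spellings produce the same string, since there is nothing to interpolate:

```python
# An f-string with no {expression} inside is identical to the plain
# literal, so the f prefix is redundant here.
with_f = f"/mnt/data/country/abb/countrydata.parquet"
plain = "/mnt/data/country/abb/countrydata.parquet"
print(with_f == plain)  # True
```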
