Databricks Community

shamly · 11-17-2022

When I try to read a Parquet file from an Azure Data Lake container in Databricks, I get a Spark exception. Below is my code:

import pyarrow.parquet as pq

from pyspark.sql.functions import *

from datetime import datetime

data = spark.read.parquet(f"/mnt/data/country/abb/countrydata.parquet")

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 14.0 failed 4 times, most recent failure: Lost task 0.3 in stage 14.0 (TID 35) (10.135.39.71 executor 0): org.apache.spark.SparkException: Exception thrown in awaitResult:

What does this mean? What do I need to do to fix it?

Debayan · 11-17-2022

Hi @shamly pt, could you please post the full error stack trace here?

DavideAnghileri · 11-19-2022

Hi @shamly pt, more information is needed to solve the issue. However, common problems are:

The storage is not mounted
The file doesn't exist in the mounted storage

Also, there is no need for an f-string when the string contains no curly-bracket expressions, so you can remove the `f` in `f"/mnt/data/country/abb/countrydata.parquet"`.
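
The two checks mentioned above (is the storage mounted, does the file exist) can be sketched roughly as follows. This is only an illustration, not an official recipe: `diagnose_path` and `check_in_databricks` are hypothetical helper names, and the second one assumes it runs in a Databricks notebook where `dbutils` is available, using `dbutils.fs.mounts()` to list mount points and `dbutils.fs.ls()` to probe the path.

```python
def diagnose_path(path, mount_points, path_exists):
    """Classify a path as 'not mounted', 'missing file', or 'ok'.

    mount_points: iterable of mount point strings, e.g. ["/mnt/data"]
    path_exists:  callable that returns True if the path can be listed
    (Hypothetical helper, written for illustration only.)
    """
    # The path is considered mounted if it equals a mount point
    # or lies underneath one.
    mounted = any(
        path == mp or path.startswith(mp.rstrip("/") + "/")
        for mp in mount_points
    )
    if not mounted:
        return "not mounted"
    if not path_exists(path):
        return "missing file"
    return "ok"


def check_in_databricks(path, dbutils):
    """Run the diagnosis in a Databricks notebook (assumes `dbutils` exists)."""
    def _exists(p):
        # dbutils.fs.ls raises an exception when the path does not exist.
        try:
            dbutils.fs.ls(p)
            return True
        except Exception:
            return False

    mounts = [m.mountPoint for m in dbutils.fs.mounts()]
    return diagnose_path(path, mounts, _exists)
```

In a notebook, calling `check_in_databricks("/mnt/data/country/abb/countrydata.parquet", dbutils)` before `spark.read.parquet(...)` may help narrow down which of the two problems applies. If it returns "ok", the cause is probably elsewhere and the full stack trace is needed.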


spark exception error while reading a parquet file
