spark exception error while reading a parquet file

shamly
New Contributor III

When I try to read a parquet file from an Azure Data Lake container in Databricks, I get a Spark exception. Below is my query:

import pyarrow.parquet as pq
from pyspark.sql.functions import *
from datetime import datetime

data = spark.read.parquet(f"/mnt/data/country/abb/countrydata.parquet")

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 14.0 failed 4 times, most recent failure: Lost task 0.3 in stage 14.0 (TID 35) (10.135.39.71 executor 0): org.apache.spark.SparkException: Exception thrown in awaitResult:

What does this mean, and what do I need to do to fix it?

2 REPLIES

Debayan
Esteemed Contributor III

Hi @shamly pt, could you please post the full error stack here?

DavideAnghileri
Contributor

Hi @shamly pt, more info is needed to solve the issue. However, the common causes are:

  • The storage is not mounted
  • The file doesn't exist in the mounted storage
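The two checks above can be sketched roughly like this (a hypothetical snippet, assuming it runs in a Databricks notebook where `dbutils` is predefined; outside Databricks it just records that the check could not run):

```python
# Hypothetical sanity check for the two common causes above. Assumes a
# Databricks notebook, where `dbutils` is predefined; outside Databricks
# it records that the check could not run.
path = "/mnt/data/country/abb/countrydata.parquet"

try:
    fs = dbutils.fs  # noqa: F821 - dbutils exists only inside Databricks
except NameError:
    status = "no-dbutils"  # not running inside a Databricks notebook
else:
    # 1. Is the storage actually mounted? Compare the path against the
    #    current mount table.
    mount_points = [m.mountPoint for m in fs.mounts()]
    if not any(path.startswith(mp) for mp in mount_points):
        status = "not-mounted"
    else:
        # 2. Does the file exist? fs.ls raises if the path is missing.
        try:
            fs.ls(path)
            status = "ok"
        except Exception:
            status = "missing-file"

print(status)
```

If the mount is missing, remounting the container (or using the full `abfss://` URI with credentials) is the usual fix; if the file is missing, listing the parent directory with `dbutils.fs.ls` shows what is actually there.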

Also, there is no need for an f-string when the string contains no curly-brace expressions, so you can drop the f in `f"/mnt/data/country/abb/countrydata.parquet"`.
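For example, the two spellings produce the same string, since there is nothing to interpolate:

```python
# An f-string with no {expression} inside is identical to the plain
# literal, so the f prefix is redundant here.
with_f = f"/mnt/data/country/abb/countrydata.parquet"
plain = "/mnt/data/country/abb/countrydata.parquet"
print(with_f == plain)  # True
```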
