Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Unable to analyze external table | FileAlreadyExistsException

Miasu
New Contributor II

Hello experts, 

There's a CSV file, "nyc_taxi.csv", saved under users/myfolder on DBFS, and I used this file to create 2 tables:

1. nyc_taxi : created using the UI, and it appeared as a managed table saved under dbfs:/user/hive/warehouse/mydatabase.db/nyc_taxi

2. nyc_taxi2: created using the SQL commands below, and it shows as an external table, location: dbfs:/users/myfolder/nyc_taxi.csv

CREATE TABLE nyc_taxi2 
(vendor_id String,
pickup_datetime timestamp,
dropoff_datetime timestamp,
passenger_count int,
trip_distance double,
pickup_longitude double,
pickup_latitude double,
rate_code int,
store_and_fwd_flag string,
dropoff_longitude double,
dropoff_latitude double,
payment_type string,
fare_amount double,
surcharge double,
mta_tax double,
tip_amount double,
tolls_amount double,
total_amount double)
USING CSV OPTIONS("path"="/users/myfolder/nyc_taxi.csv","header" = "true");

The command below worked fine for nyc_taxi:

ANALYZE TABLE nyc_taxi compute statistics for all columns;

However, the same command for nyc_taxi2 raised a FileAlreadyExistsException. Other commands (SELECT ... FROM) work fine with the nyc_taxi2 table; so far only the ANALYZE TABLE command fails.

ANALYZE TABLE nyc_taxi2 compute statistics for all columns;

[FileAlreadyExistsException: Operation failed: "The specified path, or an element of the path, exists and its resource type is invalid for this operation.", 409, GET,......, PathConflict, "The specified path, or an element of the path, exists and its resource type is invalid for this operation.]

How can I resolve the issue? 

Thanks for the help! 

1 REPLY 1

Kaniz_Fatma
Community Manager

Hi @Miasu, the FileAlreadyExistsException raised by ANALYZE TABLE on nyc_taxi2 indicates that the path, or an element of it, already exists with a resource type that is not valid for the operation. Here are some things to try.

First, review the location registered for the nyc_taxi2 table. Double-check that it matches the location of the CSV file, nyc_taxi.csv, in DBFS (dbfs:/users/myfolder/nyc_taxi.csv). If the location is incorrect, fix it with the ALTER TABLE command.
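As a minimal sketch of that check, assuming the table lives in the current database and using the path from the question as the target, the registered location can be inspected and, if needed, changed like this:

```sql
-- Show table metadata; the "Location" row holds the path the table points to.
DESCRIBE TABLE EXTENDED nyc_taxi2;

-- Only if the reported location is wrong: point the table at the intended path.
-- The path below is the one from the question; adjust it to your environment.
ALTER TABLE nyc_taxi2 SET LOCATION 'dbfs:/users/myfolder/nyc_taxi.csv';
```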

 

Keep in mind that nyc_taxi2 is an external table, so Spark does not manage its data files; it expects the data to already exist at the specified location. Check the directory where the CSV file is stored (dbfs:/users/myfolder/) and remove any outdated files related to nyc_taxi2. You can do this through the Databricks UI or the Databricks CLI. Once the location is verified and any conflicting files are removed, retry the ANALYZE TABLE nyc_taxi2 compute statistics for all columns; command. If the issue persists, consider restarting the cluster before trying again.
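For the retry step, a sketch along these lines may help. REFRESH TABLE is not mentioned above; it is included here as an assumption, since it invalidates cached metadata and file listings for the table and is a lighter thing to try before a cluster restart. The column name comes from the table definition in the question.

```sql
-- Assumption: clearing cached metadata and file listings may help before a restart.
REFRESH TABLE nyc_taxi2;

-- Retry the statistics collection.
ANALYZE TABLE nyc_taxi2 COMPUTE STATISTICS FOR ALL COLUMNS;

-- If the command succeeds, column-level statistics for one column can be checked here.
DESCRIBE TABLE EXTENDED nyc_taxi2 fare_amount;
```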

 

Also make sure that the user executing the command has permission to both read from and write to the table's location. If all else fails, consider creating a new directory for the nyc_taxi2 table and updating the table definition to match. For instance, you can create a new folder, such as dbfs:/users/myfolder/nyc_taxi2_data/, and change the table's location to point to it.
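One possible way to do this is sketched below, under stated assumptions: the folder dbfs:/users/myfolder/nyc_taxi2_data/ has already been created and nyc_taxi.csv has been copied into it (for example through the UI or CLI). The schema is the one from the question. Whether pointing the table at a folder instead of a single file resolves the exception is not confirmed here; it is simply the change suggested above.

```sql
-- Assumption: nyc_taxi.csv has already been copied into
-- dbfs:/users/myfolder/nyc_taxi2_data/.

-- nyc_taxi2 is an external table, so dropping it removes only the
-- metastore entry; the CSV file itself is left in place.
DROP TABLE IF EXISTS nyc_taxi2;

-- Recreate the table with the same schema, pointing at the folder
-- rather than at the single CSV file.
CREATE TABLE nyc_taxi2 (
  vendor_id STRING,
  pickup_datetime TIMESTAMP,
  dropoff_datetime TIMESTAMP,
  passenger_count INT,
  trip_distance DOUBLE,
  pickup_longitude DOUBLE,
  pickup_latitude DOUBLE,
  rate_code INT,
  store_and_fwd_flag STRING,
  dropoff_longitude DOUBLE,
  dropoff_latitude DOUBLE,
  payment_type STRING,
  fare_amount DOUBLE,
  surcharge DOUBLE,
  mta_tax DOUBLE,
  tip_amount DOUBLE,
  tolls_amount DOUBLE,
  total_amount DOUBLE
)
USING CSV
OPTIONS ('path' = 'dbfs:/users/myfolder/nyc_taxi2_data/', 'header' = 'true');

-- Then retry the statistics collection against the recreated table.
ANALYZE TABLE nyc_taxi2 COMPUTE STATISTICS FOR ALL COLUMNS;
```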

 

If you encounter any further issues, feel free to ask for additional assistance! 😊
