Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Issue with reading exported tables stored in parquet

shiva12494
New Contributor II

Hi all, I exported all tables from a Postgres snapshot to S3 in Parquet format. I am trying to read a table using Databricks and am unable to do so. I get the following error: "Unable to infer schema for Parquet. It must be specified manually." I tried specifying the schema, but it still won't work. I didn't need to specify a schema to read Parquet files before this, so I am wondering what's different here. I also tried copying the Parquet file to my local machine and got an error relating to ciphertext. I have attached screenshots of the error and the file names. Any help is appreciated.

2 REPLIES

Anonymous
Not applicable

@shiva charan velichala​ :

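It's possible that the Parquet files exported from the Postgres snapshot are encrypted. Spark reads Parquet files compressed with the standard codecs (such as Snappy or gzip) without any extra steps, so compression alone is unlikely to be the cause, and the ciphertext error you saw locally points toward encryption. If that's the case, you'll need access to the encryption key, or you'll need to decrypt the files, before Databricks can read them.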

Additionally, if the schema is not being inferred correctly, you can specify it manually by calling schema() on the reader (spark.read) before loading the files. For example:

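from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Placeholder schema: list every column of the table with its type.
my_schema = StructType([
  StructField("column1", StringType(), True),   # True means the column is nullable
  StructField("column2", IntegerType(), True),
  # add the remaining columns here
])

# `spark` is the SparkSession that Databricks notebooks provide.
df = spark.read.schema(my_schema).parquet("/path/to/parquet/files")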

Replace column1, column2, etc. with the actual column names and types of your table, and the path with the location of your files.

If you're still having issues, you may want to try opening the Parquet files with another tool (such as pyarrow, the Python library for Apache Arrow) to see whether you're able to read them there.
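Before reaching for another reader, a quick check on a locally copied file can tell you whether it even looks like Parquet. The sketch below relies on the Parquet format's magic bytes: a plain file starts and ends with PAR1, and a file written with Parquet modular encryption and an encrypted footer uses PARE instead. The function name classify_parquet_file and the example path are hypothetical, and this only inspects the first and last four bytes, so it is a rough diagnostic rather than a full validation.

```python
import os
from pathlib import Path

PARQUET_MAGIC = b"PAR1"            # plain Parquet file (footer not encrypted)
PARQUET_ENCRYPTED_MAGIC = b"PARE"  # Parquet modular encryption, encrypted footer


def classify_parquet_file(path):
    """Hypothetical helper: classify a local file by its leading/trailing magic bytes.

    Returns 'parquet', 'parquet-encrypted-footer', or 'not-parquet'.
    """
    data_path = Path(path)
    # A file shorter than 8 bytes cannot hold both the leading and trailing magic.
    if data_path.stat().st_size < 8:
        return "not-parquet"
    with data_path.open("rb") as handle:
        head = handle.read(4)
        handle.seek(-4, os.SEEK_END)
        tail = handle.read(4)
    if head == PARQUET_MAGIC and tail == PARQUET_MAGIC:
        return "parquet"
    if head == PARQUET_ENCRYPTED_MAGIC and tail == PARQUET_ENCRYPTED_MAGIC:
        return "parquet-encrypted-footer"
    # Anything else: the bytes on disk are not a Parquet container as-is,
    # for example because the whole object was encrypted at another layer.
    return "not-parquet"


if __name__ == "__main__":
    # Hypothetical local copy of one exported file.
    example = "/tmp/exported_table.parquet"
    if Path(example).exists():
        print(classify_parquet_file(example))
```

If the result is 'parquet', a follow-up such as pyarrow.parquet.read_schema on the same file should show the column names and types. If it is 'not-parquet', the file is most likely still encrypted or is not a Parquet data file at all, which would explain why Spark cannot infer a schema from it.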

Anonymous
Not applicable

Hi @shiva charan velichala​ 

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
