Delta Live Tables saving as corrupt files

kaleighspitz
New Contributor

Hello,

I am using Delta Live Tables to store data and save it to ADLS. I've specified the storage location in my Delta Live Tables pipeline settings. However, when I check the files saved in ADLS, they are corrupt octet-stream files.
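For context, my table definitions look roughly like this (the source path and names below are placeholders, not my real ones):

    import dlt
    from pyspark.sql import functions as F

    @dlt.table(comment="Raw events ingested from source storage")  # placeholder table name
    def raw_events():
        return (
            spark.read.format("json")
            .load("/mnt/source/events")  # placeholder source path
            .withColumn("ingested_at", F.current_timestamp())
        )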

I have tried the following:

  • Created a new Delta Live Tables pipeline with a different storage location
  • Restarted the Delta Live Tables pipeline

Has anyone else run into this issue, or does anyone know how to fix it?

Thanks,

Kaleigh

1 REPLY

Kaniz
Community Manager

Hi @kaleighspitz,

If your Delta Lake files are being saved to ADLS as corrupt files, there may be an issue with how the data is being written to the file system. Here are a few suggestions to help troubleshoot:

  1. Verify the integrity of the data saved by your Delta Live Tables pipeline: Run queries or Spark transformations to confirm that the data being written to Delta Lake is correct. You can use the DeltaTable API from the delta.tables module to query the tables and inspect their write history (see the first sketch after this list).

  2. Check the storage location configuration: Ensure that you have specified the correct storage location for the Delta Live Tables pipeline, that the location exists, and that the path is correctly formatted. You can verify the storage location in the pipeline settings in the Databricks UI, or list it directly from a notebook (see the second sketch after this list).

  3. Check the ADLS account configuration: Verify that the ADLS account is configured correctly and has the necessary permissions to write to the specified storage location. Make sure that you have the correct credentials and that the storage account is accessible from the cluster (the third sketch after this list shows a typical service-principal setup).

  4. Check the file format: Verify that the files you see are what Delta Lake is expected to write. A Delta table is stored as binary Parquet data files plus a _delta_log directory of JSON transaction logs, and ADLS typically labels the Parquet files with an application/octet-stream content type, so they will not open as readable text even when the table is healthy. The reliable check is whether the table reads back correctly (the first sketch after this list covers this).

  5. Check file encoding: If you're also writing plain text files, make sure that the encoding of the text files being written to ADLS is compatible with the encoding expected by downstream readers. You can set the encoding explicitly with the encoding parameter of Python's open() when writing files (see the last sketch after this list).

  6. Check cluster configuration: Problems writing files to ADLS can also stem from the cluster itself. Verify the Spark version you're using, check the number and size of executors, and ensure the cluster has enough storage to write all of the files.
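For point 1, here is a minimal sketch of reading a pipeline table back and inspecting its history with the DeltaTable API; the path below is a placeholder for your pipeline's storage location and table name:

    from delta.tables import DeltaTable

    # Placeholder: <pipeline storage location>/tables/<table name>
    table_path = "abfss://container@account.dfs.core.windows.net/dlt/my_pipeline/tables/my_table"

    # If this reads cleanly, the underlying files are valid Delta/Parquet data
    df = spark.read.format("delta").load(table_path)
    df.printSchema()
    print(df.count())

    # Inspect the transaction log for recent write operations
    dt = DeltaTable.forPath(spark, table_path)
    dt.history().select("version", "timestamp", "operation").show(truncate=False)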
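For point 2, a quick way to confirm the storage location exists and is reachable is to list it with dbutils; a healthy Delta Live Tables storage location contains subdirectories such as tables/ and system/. The path is again a placeholder:

    # Placeholder: the storage location from your pipeline settings
    storage_path = "abfss://container@account.dfs.core.windows.net/dlt/my_pipeline"

    for info in dbutils.fs.ls(storage_path):
        print(info.path, info.size)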
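For point 3, if you authenticate to ADLS with an Azure AD service principal, the Spark configuration looks roughly like this (the storage account, application ID, secret scope, and tenant ID are all placeholders):

    account = "mystorageaccount"  # placeholder storage account name

    spark.conf.set(f"fs.azure.account.auth.type.{account}.dfs.core.windows.net", "OAuth")
    spark.conf.set(f"fs.azure.account.oauth.provider.type.{account}.dfs.core.windows.net",
                   "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set(f"fs.azure.account.oauth2.client.id.{account}.dfs.core.windows.net",
                   "<application-id>")  # placeholder
    spark.conf.set(f"fs.azure.account.oauth2.client.secret.{account}.dfs.core.windows.net",
                   dbutils.secrets.get(scope="my-scope", key="sp-secret"))  # placeholder scope/key
    spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{account}.dfs.core.windows.net",
                   "https://login.microsoftonline.com/<tenant-id>/oauth2/token")  # placeholder tenant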
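And for point 5, setting the encoding explicitly when writing a text file from Python looks like this; the path is a placeholder (/dbfs/... exposes DBFS as a local path on the driver):

    # Write a small text file with an explicit UTF-8 encoding
    with open("/dbfs/tmp/export.csv", "w", encoding="utf-8") as f:
        f.write("id,name\n1,example\n")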
