Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

run md5 using CLI

pshuk
New Contributor III

Hi,

I want to run an MD5 checksum on a file I uploaded to Databricks. I can generate the MD5 for the local file, but how do I generate one for the uploaded file on Databricks using the CLI (command-line interface)? Any help would be appreciated.

I tried running databricks fs md5, but it reports that md5 is not supported.

1 REPLY

pshuk
New Contributor III

Thanks Kaniz. I do get the MD5 hash of the file locally and then upload the file to a Databricks Volume. I suppose it is Delta Lake Gen 2 storage type, but I am not able to generate the MD5 of this uploaded file using my code (running on my local machine).

If we take a step back, the only reason I am doing an MD5 checksum is to check data integrity. If there is any other way I can confirm that the file uploaded from on-prem to the Databricks volume is exactly the same as the original, then my problem would be solved. Any ideas or suggestions?
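
One possible way to do the integrity check described above, as a rough sketch rather than a confirmed Databricks feature: copy the uploaded file back to the local machine (for example with databricks fs cp, assuming your CLI version can read from the volume path) and compare MD5 digests of the original and the round-tripped copy. The helper names file_md5 and files_match are hypothetical; only Python's standard hashlib module is used.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(original_path, round_trip_path):
    """Hypothetical helper: True if both files have the same MD5 digest."""
    return file_md5(original_path) == file_md5(round_trip_path)
```

For example, files_match("data.csv", "data_downloaded.csv") would return True only if the two files are byte-for-byte identical as far as MD5 can tell (both file names are placeholders). The same file_md5 function could probably also be run inside a Databricks notebook against the volume's file path, which would avoid the download, but that depends on how the volume is exposed in your workspace.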
