run md5 using CLI

pshuk
New Contributor III

Hi,

I want to run an MD5 checksum on a file uploaded to Databricks. I can generate the MD5 for the local file, but how do I generate one for the uploaded file on Databricks using the CLI (command-line interface)? Any help would be appreciated.

I tried running databricks fs md5, but it reports that md5 is not supported.

pshuk
New Contributor III

Thanks Kaniz. I do get the MD5 hash of the file locally and then upload the file to a Databricks Volume. I suppose it is Delta Lake Gen 2 storage type, but I am not able to generate the MD5 of the uploaded file using my code (running on my local machine).

If we take a step back, the only reason I am doing an MD5 checksum is to check data integrity. If there is any other way to confirm that the file uploaded from on-prem to the Databricks volume is exactly the same as the original, my problem would be solved. Any ideas or suggestions?
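
For reference, here is a minimal sketch of the kind of integrity check I have in mind, written in Python. It only works on local files: it assumes I can first download a copy of the uploaded file back from the volume to my machine (for example with databricks fs cp, if the CLI version supports volume paths), and then compare that round-trip copy against the original. The function names and chunk size are my own choices for illustration, not part of any Databricks API.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(original_path, round_trip_path):
    """Return True if both local files have the same size and MD5 digest.

    round_trip_path is assumed to be a copy downloaded back from the
    volume; this function does not talk to Databricks itself.
    """
    # Cheap size check first; skip hashing when sizes already differ.
    if Path(original_path).stat().st_size != Path(round_trip_path).stat().st_size:
        return False
    return md5_of_file(original_path) == md5_of_file(round_trip_path)
```

Calling files_match with the on-prem file and the downloaded copy would tell me whether the bytes survived the upload, but it costs a full download, so something that computes the hash on the Databricks side would still be preferable.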