ilarsen
Contributor

Hey. Yep, xxhash64 (or even just hash) generates numerical values for you. Wrap it in the abs function to ensure the value is positive. In our team we used abs(hash()) ourselves... for maybe a day. Very quickly I observed a collision, and the data set was not at all large. Of course which columns you include matters, but a large factor would have been forcing the value positive: abs folds each negative hash onto its positive twin, which halves the number of possible values.
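To put rough numbers on that, here is a back-of-the-envelope sketch using the standard birthday-bound approximation. It assumes hash() returns a 32-bit signed integer (so abs leaves about 31 bits) and that hash values are uniformly distributed; the function name and the 100,000-row figure are just for illustration, not from our actual data.

```python
import math


def collision_probability(n_rows: int, bits: int) -> float:
    """Birthday-bound approximation: chance that at least two of n_rows
    uniformly random hash values collide in a space of 2**bits values."""
    space = 2 ** bits
    exponent = n_rows * (n_rows - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny
    return -math.expm1(-exponent)


# Assumed widths: 31 bits for abs(hash()), 32 for hash(), 64 for xxhash64
for label, bits in [("abs(hash())", 31), ("hash()", 32), ("xxhash64", 64)]:
    print(f"{label:>12}: {collision_probability(100_000, bits):.3g}")
```

Under those assumptions, a collision at 31 or 32 bits is more likely than not well before a table gets "large", while 64 bits leaves a lot more headroom.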

 

We dropped the abs and went with xxhash64. 64 bits is probably big enough, but the use case matters. Now we're considering the seemingly tried-and-true SHA-256 (SHA-2 with a 256-bit digest) instead.
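For anyone wanting to try the SHA-256 idea outside of SQL, a minimal sketch in plain Python with hashlib follows. The row_key helper, the separator, and the NULL marker are my own assumptions for illustration, not what we run in production; note the result is a 64-character hex string rather than a number.

```python
import hashlib


def row_key(values, sep="\x1f"):
    """Hypothetical helper: SHA-256 hex digest over a row's key columns."""
    # Join with a separator unlikely to appear in the data so that
    # ("ab", "c") and ("a", "bc") do not produce the same input string.
    # None gets its own marker so it differs from an empty string.
    parts = ["\x00" if v is None else str(v) for v in values]
    return hashlib.sha256(sep.join(parts).encode("utf-8")).hexdigest()


print(row_key(["2024-01-01", 42, None]))
```

The trade-off is size: a string key is bulkier to store and join on than a 64-bit integer, so whether it is worth it depends on the use case.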