Hi All,
Sometimes to load data we want to only update a row based on changes in values. SCD1 type scenarios for a data warehouse. One approach is equivalency A=A, B=B etc. Another is generating a hash of all rows of interest, I believe pretty common. I have found several approaches as to how concatenating a row that will always successfully hash.
Firstly lets get concat() out of the way, it doesn't handle nulls. Any null and you get no hash as you end up with null output. Wrapping every column in isNull isn't great when you have 100 columns to deal with, although AI dev tools do help it still ends up looking untidy to me.
Next concat_ws() looks good as an option it can handle nulls, stringifies the row and produces a string output regardless of datatypes
Coalesce() also can, but appears limited to only a certain data types within a construct, so may fail on many typical structures where you have date, int and string data for example in a row. It would require creating strings for say Date data types. This may be a misunderstanding of least common datatype precedence, but someone can maybe explain what creates a typical risk in this case
Finally a but left field, but create a json string using to_json(). This will never fail to create a string with data in it that can be hashed. It may get a bit long in it's construct.
My questions:
Are there any edge cases where concat_ws will fail to produce a string when any single column in the row is populated?
Has anyone done any tests of the performance of any of these to know how one will fail to perform in large data situations?
My crude performance testing shows them very similar, I am sure there are tested results which provides a win for one

In the above when I take out scheduling time for the queries concat_ws returned 37 rows/sec and to_json returned 28 rows/sec I have run the test a few times to pick up on any nuances and this result difference looks to be caching. If I run the test with the other query first the second is always fastest.
https://docs.databricks.com/aws/en/sql/language-manual/functions/concat
https://docs.databricks.com/aws/en/sql/language-manual/functions/coalesce
https://docs.databricks.com/aws/en/sql/language-manual/functions/concat_ws
https://docs.databricks.com/aws/en/sql/language-manual/functions/to_json