Then unfortunately you will have a problem, because from the engine's perspective the principal is not allowed to see or use the real values from dataset1.
The principal needs to be able to unmask the data in dataset1 so that the actual value is available to join against dataset2.
If they only see the masked value, then you're joining "***" to "123", which will never match.
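To make the failure concrete, here is a toy simulation in plain Python. It is only a sketch of the idea, not how the engine actually evaluates masks: the `mask` and `join_on_key` helpers, the `can_unmask` flag, and the sample rows are all made up for illustration.

```python
# Toy model: the join only ever sees what the principal is allowed to see.
# All names here are hypothetical; this is not a real engine API.

def mask(value, can_unmask):
    """Return the real value if the principal may unmask it, else a placeholder."""
    return value if can_unmask else "***"


def join_on_key(left_rows, right_rows, can_unmask):
    """Inner join on 'id', where the left side's key is subject to masking."""
    right_by_key = {row["id"]: row for row in right_rows}
    joined = []
    for row in left_rows:
        visible_key = mask(row["id"], can_unmask)  # what the principal sees
        match = right_by_key.get(visible_key)
        if match is not None:
            joined.append({**row, **match})
    return joined


dataset1 = [{"id": "123", "name": "alice"}]      # id is masked for some principals
dataset2 = [{"id": "123", "order_total": 42}]    # id is not masked

masked_result = join_on_key(dataset1, dataset2, can_unmask=False)
unmasked_result = join_on_key(dataset1, dataset2, can_unmask=True)

print(masked_result)    # no rows: "***" never equals "123"
print(unmasked_result)  # one joined row
```

With `can_unmask=False` the lookup key is the placeholder, so nothing in dataset2 matches it.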
The masking rules should be set up in a centralized fashion (ideally using Governed Tags) so that the same rules apply to the same classification of data across all the datasets in a catalog/schema (using ABAC). Otherwise you are going to run into these inconsistencies.
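As a rough illustration of why centralizing helps, the sketch below keys the masking rule on a classification tag rather than on an individual column. This is a plain-Python toy model, not the actual Governed Tags or ABAC policy syntax: the tag name `pii`, the group `pii_readers`, and the `read_value` helper are all assumptions made up for the example.

```python
# Toy model of tag-driven masking: one rule per classification, applied
# wherever that tag appears. All names are hypothetical.

# One masking rule per classification tag.
MASK_RULES = {"pii": lambda value: "***"}

# Columns in different datasets carrying the same classification.
COLUMN_TAGS = {
    ("dataset1", "customer_id"): "pii",
    ("dataset2", "customer_id"): "pii",
}


def read_value(dataset, column, value, principal_groups):
    """Return the value as this principal would see it."""
    tag = COLUMN_TAGS.get((dataset, column))
    rule = MASK_RULES.get(tag)
    # Untagged columns, or principals in the exempt group, see the real value.
    if rule is None or "pii_readers" in principal_groups:
        return value
    return rule(value)


analyst = {"analysts"}
steward = {"analysts", "pii_readers"}

# The same principal gets the same treatment on both datasets.
analyst_view = [read_value(ds, "customer_id", "123", analyst) for ds in ("dataset1", "dataset2")]
steward_view = [read_value(ds, "customer_id", "123", steward) for ds in ("dataset1", "dataset2")]

print(analyst_view)
print(steward_view)
```

Because the rule follows the tag, a principal either sees the real value on both sides of the join or on neither, instead of one side being masked and the other not.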