Kronos

I just found the tokenizer definition, and it is indeed WordPiece tokenization.

So I think the tutorial is wrong. 

https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5/blob/main/tokenizer.json
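
For anyone who wants to check this themselves: in a Hugging Face tokenizer.json, the algorithm is recorded under the "model" key in its "type" field. Here is a minimal sketch of reading that field. The function name and the tiny sample string are mine and purely illustrative (the sample is a stand-in, not the real file contents).

```python
import json


def tokenizer_model_type(tokenizer_json_text: str) -> str:
    """Return the model.type field from the text of a tokenizer.json file."""
    config = json.loads(tokenizer_json_text)
    return config["model"]["type"]


# Illustrative stand-in for a tokenizer.json; the real file has a full vocab
# plus normalizer, pre-tokenizer, and post-processor sections.
sample = json.dumps(
    {
        "model": {
            "type": "WordPiece",
            "unk_token": "[UNK]",
            "continuing_subword_prefix": "##",
            "vocab": {"[UNK]": 0, "token": 1, "##izer": 2},
        }
    }
)

print(tokenizer_model_type(sample))
```

To check the actual model, download tokenizer.json from the link above and pass the file's text to the function, e.g. tokenizer_model_type(open("tokenizer.json", encoding="utf-8").read()).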