06-26-2025 05:13 AM
I just found the tokenizer definition, and it is indeed WordPiece tokenization.
So I think the tutorial is wrong.
https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5/blob/main/tokenizer.json
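For anyone who wants to check this themselves, here is a minimal sketch of how the tokenizer type can be read out of a tokenizer.json file. It assumes the usual Hugging Face tokenizers layout, where the top-level "model" object carries a "type" field. The helper name `tokenizer_model_type` and the tiny sample JSON are made up for illustration; the sample is not the real file from the link above.

```python
import json


def tokenizer_model_type(tokenizer_json_text: str) -> str:
    """Return the "type" of the "model" section in a tokenizer.json string.

    Assumes the common Hugging Face tokenizers layout; returns "unknown"
    when the "model" section or its "type" field is missing.
    """
    config = json.loads(tokenizer_json_text)
    model = config.get("model") or {}
    return model.get("type", "unknown")


# Hypothetical stand-in for a downloaded tokenizer.json (not the real file).
sample = json.dumps(
    {
        "model": {
            "type": "WordPiece",
            "unk_token": "[UNK]",
            "continuing_subword_prefix": "##",
            "vocab": {"[UNK]": 0, "hello": 1},
        }
    }
)

print(tokenizer_model_type(sample))
```

To run it against the real model, download tokenizer.json from the link above, read it into a string, and pass that string to the function.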