Tokenization

Tokenization is a way to break a continuous string of text into small individual chunks called tokens; each token can be a word, a character, or a sub-word.
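As a quick illustration, the sketch below (plain Python, no external libraries) contrasts word-level and character-level tokenization of the same string; sub-word tokenization usually needs a learned vocabulary and is sketched further down.

```python
text = "Tokenization breaks text into tokens."

# Word-level: split on whitespace.
word_tokens = text.split()
print(word_tokens)       # ['Tokenization', 'breaks', 'text', 'into', 'tokens.']

# Character-level: every character becomes its own token.
char_tokens = list(text)
print(char_tokens[:10])  # ['T', 'o', 'k', 'e', 'n', 'i', 'z', 'a', 't', 'i']
```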

Different tokenization schemes suit different needs: word-level tokenizers keep sequences short but struggle with out-of-vocabulary words, character-level tokenizers never go out of vocabulary but produce long sequences, and sub-word schemes such as BPE and WordPiece strike a balance between the two.
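For sub-word tokenization, a trained tokenizer is typically loaded from a library. The sketch below is one possible illustration, assuming the Hugging Face `transformers` package is installed and can download the `bert-base-uncased` WordPiece tokenizer; the exact splits depend on the learned vocabulary.

```python
# Sub-word (WordPiece) tokenization sketch; assumes `transformers`
# is installed and `bert-base-uncased` can be downloaded.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Rare or long words are split into sub-word pieces; common words stay whole.
print(tokenizer.tokenize("tokenization"))
# e.g. ['token', '##ization']
```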
