Tokenization is a way to break a string of continuous text into small individual chunks called tokens; each token can be a word, a character, or a sub-word.
Different tokenization schemes can be used depending on which of these granularities you pick.
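To make the three granularities concrete, here is a minimal sketch in plain Python (no tokenizer library; the sub-word split is hand-picked for illustration, similar in spirit to what BPE or WordPiece would produce):

```python
# One string, tokenized at three different granularities.
text = "unbelievable results"

# Word-level: split on whitespace.
word_tokens = text.split()            # ['unbelievable', 'results']

# Character-level: every character is its own token.
char_tokens = list(text)              # ['u', 'n', 'b', 'e', ...]

# Sub-word level (illustrative, hand-picked split): rare words are broken
# into more frequent pieces, roughly what BPE/WordPiece would output.
subword_tokens = ["un", "believ", "able", "results"]

print(word_tokens)
print(char_tokens[:5])
print(subword_tokens)
```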
So let's say your tokenizer divides the whole text corpus into chunks of smaller words, characters, sub-words, etc., and the set of unique tokens it produces has size T. This T becomes the vocabulary size of your model.
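Here is a toy word-level example (plain Python; the corpus string is made up for illustration) showing that the vocabulary size is the count of unique tokens, not the total token count:

```python
# A toy sketch of how the vocabulary (and its size) falls out of tokenization.
corpus = "the cat sat on the mat the cat slept"

tokens = corpus.split()          # word-level tokenization of the corpus
vocab = sorted(set(tokens))      # the vocabulary is the set of *unique* tokens
vocab_size = len(vocab)          # this is the T your model works with

# Map each token to an integer id, as a model's embedding table expects.
token_to_id = {tok: i for i, tok in enumerate(vocab)}

print(len(tokens))    # 9  -> total tokens in the corpus
print(vocab_size)     # 6  -> unique tokens = vocabulary size T
print(token_to_id)    # {'cat': 0, 'mat': 1, 'on': 2, 'sat': 3, 'slept': 4, 'the': 5}
```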