Why is tokenization needed before a LanguageModelFeaturizer?

I cannot understand this requirement, especially when BERT is used with the LanguageModelFeaturizer. After all, BERT's input does not need to be pre-tokenized, since its own WordPiece tokenizer handles raw text directly. Is the separate tokenizer in the pipeline there for NER?
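
To make the premise concrete, here is a minimal sketch (assuming the HuggingFace transformers library and the public `bert-base-uncased` checkpoint, which are not part of my actual pipeline) showing that BERT's own tokenizer accepts raw, untokenized text:

```python
from transformers import AutoTokenizer

# Hypothetical example checkpoint; any BERT variant behaves the same way.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Why is tokenization needed here?"

# The tokenizer takes the raw string directly and produces the
# WordPiece sub-tokens that BERT actually consumes...
print(tokenizer.tokenize(text))

# ...and the model-ready input IDs, again straight from raw text.
print(tokenizer(text)["input_ids"])
```

Given that BERT already does this internally, I don't see what an upstream tokenizer component adds.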