- With the WhitespaceTokenizer and CountVectorsFeaturizer, we generated more training data and the chatbot's classification performance increased significantly.
- Now that we have pre-trained BERT embeddings, we are debating between two approaches (both pipelines are sketched below):
  1. Generate more data by fine-tuning a BERT language model, then feed that data into training to further improve classification performance.
  2. Skip generating more training data, because NLU with pre-trained BERT embeddings can already generalize from a sentence to similar sentences.
What do you think about this? Which approach is better?
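
For context, here is a minimal sketch of the two pipelines being compared, assuming Rasa 2.x. The `model_weights` checkpoint and the `DIETClassifier` epoch count are placeholder assumptions, not our actual config:

```yaml
# Current sparse-feature pipeline: whitespace tokens + bag-of-words counts.
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
    epochs: 100                        # placeholder value
---
# Dense-feature alternative: pre-trained BERT embeddings via LanguageModelFeaturizer.
pipeline:
  - name: WhitespaceTokenizer
  - name: LanguageModelFeaturizer
    model_name: bert
    model_weights: bert-base-uncased   # placeholder checkpoint
  - name: DIETClassifier
    epochs: 100
```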