Just to make sure of one thing: you have 3000 distinct entities, and for those you provide n samples per intent? If so, I’d recommend using a spaCy pipeline and enhancing the spaCy model by retraining it on your custom entities, e.g. as described here (Medium) or here (spaCy docs).
This way, you could tell Rasa to auto-fill the slots with the spaCy-detected entities, or simply to extract them if you don’t use slots. Either way, this will reduce Rasa’s training time significantly.
Just keep the catastrophic forgetting problem in mind when retraining the spaCy model.
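To make the retraining step a bit more concrete, here is a minimal sketch of how the training data could be prepared, assuming you keep your custom entities in a simple lookup table. The names `annotate`, `mix_with_revision`, `gazetteer` and the `DRUG` label are made up for illustration and are not part of spaCy or Rasa. The output uses the `(text, {"entities": [(start, end, label)]})` shape with character offsets that spaCy's training examples are built from. The second helper blends in examples of labels the model already knows, which is one common way to soften catastrophic forgetting, not the only one.

```python
import random
import re


def annotate(text, gazetteer):
    """Return (text, {"entities": [(start, end, label), ...]}).

    Uses case-insensitive whole-word matching against a hypothetical
    gazetteer of {entity_value: label}. Overlapping matches are NOT
    resolved here; spaCy rejects overlapping entity spans, so a real
    setup would need to filter them first.
    """
    spans = []
    for value, label in gazetteer.items():
        pattern = r"\b" + re.escape(value) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), label))
    return text, {"entities": sorted(spans)}


def mix_with_revision(new_examples, revision_examples, ratio=1.0, seed=0):
    """Blend new-entity examples with examples of already-known labels.

    `ratio` is the number of revision examples per new example
    (an assumed knob, tune it for your data).
    """
    rng = random.Random(seed)
    n_revision = min(len(revision_examples), int(len(new_examples) * ratio))
    mixed = list(new_examples) + rng.sample(revision_examples, n_revision)
    rng.shuffle(mixed)
    return mixed


# Hypothetical data, for illustration only.
gazetteer = {"aspirin": "DRUG", "ibuprofen": "DRUG"}
new_examples = [
    annotate(text, gazetteer)
    for text in ["I need aspirin today", "Is ibuprofen in stock?"]
]
revision_examples = [("Berlin is nice", {"entities": [(0, 6, "GPE")]})]
training_data = mix_with_revision(new_examples, revision_examples)
```

The resulting `training_data` would then be fed to whatever spaCy update loop you use. The revision examples can come from the original model's own predictions on unlabeled text, so you don't have to annotate them by hand.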
If you instead have 3000 samples containing n custom entities, then there is most likely a problem with your setup, since this is not a large amount of training data. The CRFEntityExtractor uses sklearn-crfsuite, which is known to be pretty fast.