Numbers as training data

How can I use numbers as training data?

In my case, I have a numerical symbol for each intention, e.g. 1 for affirm and 2 for deny, or 3804 for intention A and 3805 for intention B.

From my testing, the “supervised_embeddings” pipeline can’t tell numbers apart. I have training data 1 for affirm and training data 2 for deny, but it always picks the affirm intent whether I input 1 or 2. It seems to parse all numbers as the same intention.

I know this can be solved with non-NLP handling in code, but it would be better if I could achieve it with training data.


I don't think this is possible with the supervised_embeddings pipeline. If the NLU data for one intent contains both numbers and normal text, the classifier cannot learn that a certain number corresponds to a certain intent. The pipeline also replaces numbers with a unified token, so the classifier never sees the actual number.
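
To illustrate why 1 and 2 end up looking identical, here is a minimal sketch of that kind of number masking. It is not the pipeline's actual code: the function name mask_numbers and the token name __NUMBER__ are assumptions made for illustration only.

```python
import re

# Hypothetical placeholder token; the real pipeline's token name may differ.
NUMBER_TOKEN = "__NUMBER__"


def mask_numbers(text: str) -> str:
    """Replace every standalone run of digits with one shared token."""
    return re.sub(r"\b[0-9]+\b", NUMBER_TOKEN, text)


# After masking, "1", "2" and "3804" are all the same string,
# so a classifier trained on the masked text cannot separate them.
print(mask_numbers("1"))
print(mask_numbers("2"))
print(mask_numbers("3804"))
```

Once every number has been collapsed like this, the training examples for affirm and deny are indistinguishable, which would match the behaviour described in the question.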

However, you can write your own intent classifier to achieve this, something like rasa/ at master · RasaHQ/rasa · GitHub, for example. Just check whether the input message is equal to a certain number and return the corresponding intent.
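
As a rough sketch of that lookup, written in plain Python rather than as a Rasa component: the names NUMBER_TO_INTENT and classify_number and the intent names intention_a and intention_b are hypothetical, and the way you wrap this into a custom component depends on your Rasa version.

```python
from typing import Dict, Optional

# Hypothetical mapping, based on the numbers mentioned in the question.
NUMBER_TO_INTENT: Dict[str, str] = {
    "1": "affirm",
    "2": "deny",
    "3804": "intention_a",
    "3805": "intention_b",
}


def classify_number(message: str) -> Optional[Dict[str, object]]:
    """Return an intent dict if the whole message is a known number.

    Returns None for anything else, so the regular classifier's
    prediction can be kept for normal text.
    """
    intent = NUMBER_TO_INTENT.get(message.strip())
    if intent is None:
        return None
    # Exact match on the number, so full confidence is assumed here.
    return {"name": intent, "confidence": 1.0}


print(classify_number("2"))
print(classify_number("hello"))
```

Returning None for non-numeric input means the check can run alongside the trained classifier and only override it on an exact match.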

Does that help?