How do I use the OOV token in my Rasa bot? I read about it in the documentation but did not get a clear idea of how to use it. In which file (actions, domain, NLU) do I insert the oov_word or oov_token? I need a stepwise solution so that this does not become more tiresome than it already is…
Example: I need to capture any reason an employee gives in a leave application. Kindly guide me.
Also, since I have used supervised_embeddings as the pipeline, do I change the pipeline, or do I need to add "CountVectorsFeaturizer" to the existing pipeline?
Thanks, @JiteshGaikwad. You would then use the oov token in your NLU training data. For example, our rasa-demo bot uses an oov value in an enter_data intent:
## intent:enter_data
- my budget is oov
- oov
- oov per year
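To make the mechanics a bit more concrete, here is a minimal Python sketch of the substitution idea behind an OOV token: words that were never seen in training get replaced by the token, so examples like the ones above teach the model what to do with unknown words. This is not Rasa's own code. `build_vocabulary` and `replace_oov` are hypothetical helper names, and lowercasing plus whitespace splitting are simplifying assumptions.

```python
from typing import Iterable, List, Set

OOV_TOKEN = "oov"  # assumed to match the token used in the training examples


def build_vocabulary(training_examples: Iterable[str]) -> Set[str]:
    """Collect every lowercased, whitespace-separated word seen in training."""
    vocabulary: Set[str] = set()
    for example in training_examples:
        vocabulary.update(example.lower().split())
    return vocabulary


def replace_oov(message: str, vocabulary: Set[str]) -> List[str]:
    """Replace each word that was not seen in training with the OOV token."""
    return [
        word if word in vocabulary else OOV_TOKEN
        for word in message.lower().split()
    ]


# The three enter_data examples from the post above.
training = ["my budget is oov", "oov", "oov per year"]
vocab = build_vocabulary(training)

print(replace_oov("my budget is 5000", vocab))  # ['my', 'budget', 'is', 'oov']
```

The unseen word "5000" ends up looking exactly like the `oov` in the training example "my budget is oov", which is why the token has to appear in the training data at all.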
@stephens @JiteshGaikwad @btotharye @JulianGerhard It works if I remove my default fallback response. So either the fallback works or the oov works. It accepts the flow, but when I try to print the value of the slot "reason", it shows "None". Is it because I have used "oov" with an entity/slot?
If I enter anything from this intent, the reason is printed fine. But if I enter anything else, it should print whatever was entered as the reason, and it does not. Please help.
I want to join this discussion, as I found no ready recipe for this particular problem in any of the forum branches. It took me more than 1 week of frustration and a huge number of attempts to crack this task. IMHO, this should be described in the very first intro Rasa guide, as extracting any entities from user inputs (based on a set of example inputs having exactly the same structure) is arguably the most popular thing one wants from the bot.
Environment:
Rasa Version : 2.0.3
Rasa SDK Version : 2.0.0
Rasa X Version : None
Python Version : 3.7.7
Operating System : Darwin-19.6.0-x86_64-i386-64bit
Here is the config pipeline that finally let me extract any word the user puts into the training phrase:
Note the use of CRFEntityExtractor. It is an important part of the success, as I couldn't make it work with the default RegexFeaturizer and LexicalSyntacticFeaturizer.
Then in the nlu.yml file I use training phrases like these:
- I want to buy some [oranges](fruits)
- I want to buy some [mandarines](fruits)
- I want to buy some [grapefruits](fruits)
- I want to buy some [kiwis](fruits)
- I want to buy some _oov_
With this setup the bot grabs any word that the user puts in place of _oov_, so if I put "I want to buy some BMWs", it will recognise "BMWs" as fruits and save it to the slot. You might have to handle this later, since BMWs are clearly not fruits, but that is a completely different story. I hope this helps somebody.
FYI, after some more trials I've figured out that oov recognition does not happen at all with DIETClassifier, but sometimes works with CRFEntityExtractor if I provide at least 10 test phrases with different words in place of the oov token.
Nevertheless, it stopped working after I added more modified variations of the test phrases (rephrased in different but very similar words).
Maybe one has to be an NLP pro to quickly and successfully build a bot with Rasa, but for me as a complete beginner it takes enormous effort, and the results still make zero sense.
@Dar0n Note that CRFEntityExtractor doesn't even use the OOV token in your configuration, because it is listed before the OOV token in the pipeline. Use this link for more info on pipeline order.
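The ordering point can be checked mechanically. The rough sketch below is not part of Rasa: it represents a pipeline as a Python list that mirrors the entries of `config.yml` and reports whether a component with `OOV_token` set appears before the entity extractor. The helper name `oov_featurizer_precedes` and both example pipelines are hypothetical; only the component names and the `OOV_token` option are taken from Rasa.

```python
from typing import Any, Dict, List

Pipeline = List[Dict[str, Any]]


def oov_featurizer_precedes(pipeline: Pipeline, component: str) -> bool:
    """Return True if a component with OOV_token set comes before `component`."""
    seen_oov_featurizer = False
    for entry in pipeline:
        if entry["name"] == component:
            return seen_oov_featurizer
        if entry.get("OOV_token"):
            seen_oov_featurizer = True
    return False  # `component` is not in the pipeline at all


# Hypothetical pipeline: the extractor is listed before the OOV featurizer.
extractor_first = [
    {"name": "WhitespaceTokenizer"},
    {"name": "CRFEntityExtractor"},
    {"name": "CountVectorsFeaturizer", "OOV_token": "_oov_"},
]

# Same components with the featurizer moved ahead of the extractor.
featurizer_first = [
    {"name": "WhitespaceTokenizer"},
    {"name": "CountVectorsFeaturizer", "OOV_token": "_oov_"},
    {"name": "CRFEntityExtractor"},
]

print(oov_featurizer_precedes(extractor_first, "CRFEntityExtractor"))   # False
print(oov_featurizer_precedes(featurizer_first, "CRFEntityExtractor"))  # True
```

Passing this check does not guarantee that the extractor actually consumes those features. It only rules out the ordering problem described above.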