What text do you want to spell-check? I think spell-correcting the user input is usually not worth it (it creates more problems than it solves), and with enough NLU data you’ll capture the common typos of your users.
Forgive my bluntness, but I completely disagree with you. The type of text or the topic is not the point here. The issue is that when the user input is mistyped, even a single missing or wrong letter can trigger a fallback; in other words, the NLU prediction may fail because of one letter. Mistyping is very normal in daily life, and even native speakers make spelling mistakes. You cannot fix this problem by adding new NLU data. Can you imagine entering every possible misspelling of a 100-character sentence, and doing that for every sentence in your NLU data? The idea of “adding new mistyped sentences into NLU” is unrealistic. There needs to be a solution that makes the correction at the beginning of the pipeline, just like Google Search: when you type a sentence there, it automatically corrects you. We need something like that.
I get your point, @huseyinyilmaz01. I am not saying that you have to enter all possible misspellings. Some mistakes are quite common, and if you use the CountVectorsFeaturizer with a character n-gram setting, your pipeline can pay attention to parts of words, which might help.
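To illustrate why character n-grams can soften the effect of typos, here is a minimal sketch in plain Python. It is not Rasa's implementation: the helper names `char_wb_ngrams` and `cosine` are made up for this example, and the padding of each word with spaces is only loosely modeled on the `char_wb` analyzer idea. The n-gram range of 1 to 4 is an assumption, not a recommended setting.

```python
import math
from collections import Counter


def char_wb_ngrams(text, min_n=1, max_n=4):
    """Count character n-grams taken inside each space-padded word (hypothetical helper)."""
    counts = Counter()
    for word in text.lower().split():
        padded = f" {word} "  # pad so that word boundaries become part of the n-grams
        for n in range(min_n, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


original = "book a flight"
typo = "bok a flihgt"

# Word-level counts: the two sentences only share the token "a".
word_sim = cosine(Counter(original.split()), Counter(typo.split()))
# Character n-gram counts: most sub-word pieces survive the typos.
char_sim = cosine(char_wb_ngrams(original), char_wb_ngrams(typo))

print(f"word-level similarity:   {word_sim:.2f}")
print(f"char n-gram similarity:  {char_sim:.2f}")
```

With whole words as features, a single wrong letter turns a known token into an unknown one. With character n-grams, the misspelled sentence still shares many features with the correct one, so the classifier has something to work with.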
The problem with spell-correction is that the auto-correction might be wrong. Perhaps you could get around this by forwarding both the original and the auto-corrected text to the featurizers… In any case, what I’ve heard is that using count vectors works better than spell correction. That might depend on the language, though.
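A rough sketch of the "forward both texts" idea, under heavy assumptions: the `VOCABULARY` list, the `autocorrect` function, and the `featurize_both` function are all hypothetical and not part of Rasa. The corrector here is just `difflib.get_close_matches` from the Python standard library, standing in for a real spell checker, and the cutoff of 0.75 is arbitrary.

```python
import difflib
from collections import Counter

# Hypothetical vocabulary; a real corrector would use a proper dictionary.
VOCABULARY = ["book", "a", "flight", "to", "berlin", "cancel", "my"]


def autocorrect(text, vocabulary=VOCABULARY, cutoff=0.75):
    """Replace each word with its closest vocabulary entry, if one is close enough."""
    corrected = []
    for word in text.lower().split():
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else word)  # keep unknown words unchanged
    return " ".join(corrected)


def featurize_both(text):
    """Build bag-of-words features from the original AND the corrected text."""
    features = Counter()
    for word in text.lower().split():
        features[f"orig:{word}"] += 1
    for word in autocorrect(text).split():
        features[f"corr:{word}"] += 1
    return features


message = "bok a flihgt"
print(autocorrect(message))
print(featurize_both(message))
```

Because the original tokens are kept under their own prefix, a wrong correction does not erase what the user actually typed; the classifier can learn how much to trust each view.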
I’ll get back to you if I can find a spell checker that you can use with Rasa 2.
@huseyinyilmaz01 Looks like we don’t have any working spell-checker around. Feel free to implement one and contribute to the NLU Examples repo, though!
Hello @huseyinyilmaz01
I am sorry you feel that way, but we simply do not have a spell checker component that you could use. Rasa is open source, and we encourage people to contribute to the code at any time. This is what I meant: if you want to implement a spell checker, we’d be very happy to see it as a contribution to our repo.