Spell Checking issue


What is the best solution for spell checking in Rasa, or is there any solution at all?

I checked the topics in the Rasa Forum. Some developers mention “SymSpellpy”, but there is no implementation or sample for this tool.

I would be glad if you could share a solution and your implementations. Thank you.

Hello @huseyinyilmaz01

What text do you want to spell-check? I think spell-correcting the user input is usually not worth it (it creates more problems than it solves), and with enough NLU data you’ll capture the common typos of your users.

Forgive my bluntness, but I completely disagree with you. The type of text or the topic is not the point here. The issue is that when the user input is mistyped, sometimes even a single missing or wrong letter can trigger a fallback; in other words, the NLU prediction may fail because of a single letter. Mistyping is very normal in daily life, and even native speakers misspell words. And you cannot fix this problem by adding new NLU data. Can you imagine entering every possible misspelling of a 100-character sentence, and doing this for every sentence in your NLU data? The idea of “adding new mistyped sentences into NLU” is unrealistic. There needs to be a solution that makes the correction at the beginning of the pipeline, just like Google's search engine: when you type a sentence there, it automatically corrects you. We need something like that.

I get your point, @huseyinyilmaz01. I am not saying that you have to enter all possible misspellings. Some mistakes are quite common, and if you use the CountVectorsFeaturizer with an n-gram setting, then your pipeline can pay attention to parts of words, which might help.
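To illustrate why character n-grams help with typos, here is a minimal sketch in plain Python. It is not Rasa code: the helper names `char_ngrams` and `ngram_overlap` are made up for this example, and the padding with spaces only imitates the word-boundary style of character n-grams. It shows that a misspelled word still shares most of its n-grams with the correct word, while an unrelated word shares almost none.

```python
def char_ngrams(word, n_min=2, n_max=4):
    """Return the set of character n-grams of a word.

    The word is lowercased and padded with one space on each side,
    so n-grams at the word boundaries are captured as well.
    (Hypothetical helper, for illustration only.)
    """
    padded = f" {word.lower()} "
    return {
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    }


def ngram_overlap(a, b):
    """Jaccard similarity between the character n-gram sets of two words."""
    grams_a, grams_b = char_ngrams(a), char_ngrams(b)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    # A typo keeps most n-grams of the intended word ...
    print(ngram_overlap("appointment", "apointment"))
    # ... while an unrelated word shares few or none.
    print(ngram_overlap("appointment", "weather"))
```

In a Rasa pipeline, the corresponding idea would be to configure the CountVectorsFeaturizer with a character-level analyzer and an n-gram range; check the documentation of your Rasa version for the exact option names.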

The problem with spell-correction is that the auto-correction might be wrong. Perhaps you could get around this if you forwarded both the original, as well as the auto-corrected text to the featurizers… In any case, what I’ve heard is that using the count vectors is better than spell correction. Might depend on the language though.
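A rough sketch of the "forward both" idea, using only Python's standard `difflib` module. Everything here is an assumption for illustration: the `VOCABULARY` list, the `cutoff` value, and the function names `correct_token` and `with_correction` are hypothetical, and wiring the result into a Rasa pipeline as a custom component is not shown.

```python
import difflib

# Hypothetical vocabulary; in practice it could be built from the
# words that appear in your NLU training examples.
VOCABULARY = ["book", "cancel", "appointment", "tomorrow", "my", "an", "for"]


def correct_token(token, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest vocabulary word, or the token itself if none is close.

    A high cutoff keeps the corrector conservative, which lowers
    (but does not remove) the risk of a wrong auto-correction.
    """
    lowered = token.lower()
    if lowered in vocabulary:
        return token
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def with_correction(text):
    """Return both the original and the corrected text.

    Keeping the original means downstream steps are not forced
    to trust the correction.
    """
    corrected = " ".join(correct_token(token) for token in text.split())
    return {"original": text, "corrected": corrected}


if __name__ == "__main__":
    print(with_correction("cancel my apointment"))
```

Words that are not close to anything in the vocabulary are left untouched, so names and rare terms pass through unchanged.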

I’ll get back to you if I can find a spell checker that you can use with Rasa 2.

Hi @j.mosig, I look forward to hearing from you. Rasa should have a meaningful solution for this issue, since everything revolves around text.

Thank you

@huseyinyilmaz01 Looks like we don’t have any working spell-checker around. Feel free to implement one and contribute to the NLU Examples repo, though!

Hi @j.mosig,

What do you mean by “we don’t have any … Feel free to implement one and contribute to”?

I am disappointed with the way a Rasa team member has responded. It is not polite or professional.

Hello @huseyinyilmaz01 I am sorry you feel that way. But we simply do not have a spell checker component that you could use. Rasa is open source and we encourage people to contribute to the code at any time. This is what I meant: If you want to implement a spell checker, we’d be very happy to see it as a contribution to our repo.
