Not sure if this topic is the right place to ask this, but I’m interested in discussing / getting advice on handling user spelling mistakes in text input to Rasa NLU.
It’s something I looked at a few months back but ended up putting to one side.
On one level, you could ignore spelling mistakes and simply include the misspellings in your training data. But I suspect that becomes unsustainable fairly quickly. Also, whilst it’s okay for intents, it doesn’t cope where you then need to use the extracted entity values for something (i.e. subsequent lookups), because the value you get back is still misspelled.
I did have a basic spell corrector that I ran text through before sending the corrected text to Rasa. It was okay but had limitations (it got dramatically slower on longer sentences and, in particular, on longer words).
The corrector I used was based on one of the Python ports of SymSpell (I need to go check which one): https://github.com/wolfgarbe/SymSpell
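To make the "correct first, then send to Rasa" idea concrete, here is a minimal sketch. It is not the SymSpell-based corrector described above: it uses `difflib` from the Python standard library as a stand-in, and `VOCAB`, `correct_token` and `correct_text` are hypothetical names made up for illustration. A real vocabulary would presumably be built from your training data and entity values.

```python
import difflib
import re

# Hypothetical vocabulary; in practice this might be built from the
# words in your NLU training examples and lookup tables.
VOCAB = ["book", "a", "table", "for", "two", "in", "london", "tomorrow"]


def correct_token(token, vocab=VOCAB, cutoff=0.75):
    """Return the closest vocabulary word, or the token unchanged."""
    lower = token.lower()
    if lower in vocab:
        return token  # already a known word, leave it alone
    matches = difflib.get_close_matches(lower, vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_text(text, vocab=VOCAB, cutoff=0.75):
    """Correct each alphabetic word, keeping spacing and punctuation."""
    return re.sub(
        r"[A-Za-z]+",
        lambda m: correct_token(m.group(0), vocab, cutoff),
        text,
    )


corrected = correct_text("book a tabel in londn")
# `corrected` is what would then be sent to Rasa instead of the raw input.
```

Words with no sufficiently close match are passed through untouched, so unknown names are not "corrected" into something else. The `cutoff` value is a guess and would need tuning against real user input.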
Due to those limitations, I’d also wondered about only correcting entities once returned by Rasa, but then you’ve typically lost useful context. I didn’t get around to adding neighbouring words back, but that might be a fair compromise.
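A rough sketch of that "correct the entities afterwards, with neighbouring words added back" compromise might look like the following. It assumes the parse result is a dict with `text` and an `entities` list whose items carry `entity`, `value`, `start` and `end`. `KNOWN_VALUES` and all function names are hypothetical, and `difflib` again stands in for a proper spell corrector.

```python
import difflib

# Hypothetical list of valid values per entity type (e.g. from a database).
KNOWN_VALUES = {"city": ["london", "new york", "paris"]}


def best_match(candidate, choices):
    """Return (similarity, choice) for the closest known value."""
    return max(
        (difflib.SequenceMatcher(None, candidate.lower(), c).ratio(), c)
        for c in choices
    )


def candidates_with_neighbours(text, start, end):
    """The entity span on its own, plus one word of context either side."""
    value = text[start:end]
    before = text[:start].split()
    after = text[end:].split()
    candidates = [value]
    if before:
        candidates.append(before[-1] + " " + value)
    if after:
        candidates.append(value + " " + after[0])
    return candidates


def correct_entities(parsed, known=KNOWN_VALUES, cutoff=0.75):
    """Return a copy of the entities with values snapped to known ones."""
    text = parsed["text"]
    corrected = []
    for ent in parsed.get("entities", []):
        ent = dict(ent)  # do not mutate the original parse result
        choices = known.get(ent["entity"])
        if choices:
            score, choice = max(
                best_match(c, choices)
                for c in candidates_with_neighbours(text, ent["start"], ent["end"])
            )
            if score >= cutoff:
                ent["value"] = choice
        corrected.append(ent)
    return corrected


parsed = {
    "text": "flights to new yrok tomorrow",
    "entities": [{"entity": "city", "value": "yrok", "start": 15, "end": 19}],
}
fixed = correct_entities(parsed)
```

Because only the entity span and its immediate neighbours are compared against a short list of known values, the cost no longer depends on the length of the whole sentence. The trade-off is that it only helps for entity types where you have such a list.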
What approaches have people tried?