Hi, I need to create a custom pipeline component for spell correction. Is there an existing implementation, for example one built on the fuzzywuzzy Python library, or is it necessary to write a new custom component? I'm open to opinions. Another option is to build a new component together, since I notice this is a general problem.
You can make a custom component with pyspellchecker. It has the advantage that you can add your own words and build a personalized dictionary. The way I wrote it, the component should be added at the beginning of your pipeline.
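import re
import typing
from typing import Any, Dict, Optional, Text

from rasa.nlu.components import Component
from rasa.nlu.training_data import Message
from spellchecker import SpellChecker

if typing.TYPE_CHECKING:
    from rasa.nlu.model import Metadata


class SpellCheckerEN(Component):
    """Corrects misspelled words in the message text before later components run."""

    provides = ["text"]
    defaults = {}
    language_list = ["en"]

    def __init__(self, component_config=None):
        super(SpellCheckerEN, self).__init__(component_config)
        # Load the personalized dictionary once, not on every message.
        self.spell = SpellChecker(language=None)
        self.spell.word_frequency.load_dictionary('my_en_dict.gz')

    def process(self, message: Message, **kwargs):
        mt = message.text
        # Strip punctuation so that "helo," is looked up as "helo".
        stripped = mt.translate(str.maketrans('', '', '!"#$%&\'()*+,.-:;<=>?@[\\]^_`{|}~'))
        # split() without arguments also drops the empty strings left by repeated spaces.
        for word in stripped.split():
            if word not in self.spell:
                correction = self.spell.correction(word)
                # Depending on the pyspellchecker version, correction() may return None.
                if correction:
                    # Replace whole words only, so "te" is not rewritten inside "test".
                    pattern = r'(?<!\w)' + re.escape(word) + r'(?!\w)'
                    mt = re.sub(pattern, lambda _: correction, mt)
        message.text = mt

    @classmethod
    def load(
        cls,
        meta: Dict[Text, Any],
        model_dir: Optional[Text] = None,
        model_metadata: Optional["Metadata"] = None,
        cached_component: Optional["Component"] = None,
        **kwargs: Any
    ) -> "Component":
        if cached_component:
            return cached_component
        else:
            return cls(meta)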
Thanks for the code.
Are you running a local installation of Rasa? I want to try this out, but we are running in a Docker environment, and I think I need to find a way to install the pyspellchecker module into the Rasa container. Has anybody tried this already?
Hey @wim.de.webmaster,
Actually, I am building a Docker environment and having trouble with it. Would you mind sharing some of your material on that?
You can refer to this article, in which the pyspellchecker Python library is used to correct spelling mistakes: custom component for spell checking