Every time I run the bot with a custom component I get "tokens None"

I created a custom component using Rasa 2.7.1 in which I want to use tokens. But every time I run the chatbot, I get the error "tokens None", the same as in this post: Tokens None from previous component (RASA custom component). I implemented the suggested solution, but it did not solve the problem. The chatbot is in Hungarian (using the spaCy implementation). I also tried running the bot in English (rewriting both the custom component and the config file), but nothing changed; I got the same results. Below are "custom_component.py" and "config.yml" for better understanding.

Forum posts used for the project: Using FuzzyWuzzy with lookup tables, Tokens None from previous component (RASA custom component)

Thank you in advance for your help.

custom_component.py

import os

import yaml
from rasa.nlu.components import Component
from rasa.shared.nlu.training_data.message import Message
from fuzzywuzzy import process


class FuzzyExtractor(Component):
    name = "Fuzzy"
    provides = ["entities"]
    requires = ["tokens"]
    defaults = {}
    language_list = ["hu"]
    threshold = 85

def __init__(self, component_config=None, *args):
    super(FuzzyExtractor, self).__init__(component_config)

def train(self, training_data, cfg, **kwargs):
    pass
    

def process(self, message, **kwargs):

    # previously extracted entities; fall back to an empty list (not a dict),
    # since dicts are appended to it below
    entities = message.get("entities")
    entities = list(entities) if entities else []

    
    cur_path = os.path.dirname(__file__)
    # os.path.join handles the path separator on both Windows and POSIX
    lookup_file_path = os.path.join(cur_path, "data", "city.yml")

    with open(lookup_file_path, 'r', encoding="utf8") as file:
        lookup_data = yaml.load(file, Loader=yaml.FullLoader)
        print(lookup_data.get("city"))

        try:
            # keep the Token objects: token.text / token.offset / token.end
            # are needed below
            tokens = list(message.get("tokens"))
            print('tokens', [t.text for t in tokens])
        except TypeError:
            # message.get("tokens") returned None
            print("An exception occurred")
            tokens = []
        
        try:
            for token in tokens:
                # assumes city.yml parses to {"city": [city names...]}
                fuzzy_results = process.extract(token.text, lookup_data.get("city"))
                print('fuzzy_results', fuzzy_results)

                # handle the matches for this token inside the loop, not after it
                for result, confidence in fuzzy_results:
                    if confidence >= self.threshold:
                        entities.append({
                            "start": token.offset,
                            "end": token.end,
                            "value": token.text,
                            "fuzzy_value": result,  # extract() yields (match, score) pairs
                            "confidence": confidence,
                            "entity": "city",
                        })
        except Exception:
            print("An exception occurred")
                                     

    message.set("entities", entities, add_to_output=True)

config.yml

language: hu

pipeline:
  - name: SpacyNLP
    model: hu_core_ud_lg
  - name: SpacyTokenizer
  - name: custom_component.FuzzyExtractor
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
    case_sensitive: False
  - name: CountVectorsFeaturizer
  - name: services.hun_date_extractor.HunDateExtractor
  - name: RegexEntityExtractor
  - name: EntitySynonymMapper
  - name: DIETClassifier
    epochs: 100
    entity_recognition: false
  - name: FallbackClassifier
    threshold: 0.35
  - name: ResponseSelector
    epochs: 100

policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: RulePolicy

@Sierra Is your code running? Just checking, since you used FuzzyExtractor.

Yes, it is running. For example, if the input is 'szia' ('hi' in Hungarian), the output of rasa shell --debug is:

    entities []
    message <rasa.shared.nlu.training_data.message.Message object at 0x00000261D8A30190>
    An exception occurred
    An exception occurred

So I assume that in this part of the code:

        try:
            tokens = list(message.get("tokens"))
            print('tokens', [t.text for t in tokens])
        except TypeError:
            print("An exception occurred")
            tokens = []

there are no tokens available, but I don't know why.
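A possibly relevant detail worth checking (an assumption on my side, based on Rasa 2.x's `rasa.nlu.constants.TOKENS_NAMES`): Rasa 2.x tokenizers store tokens under attribute-specific keys such as `text_tokens`, not under a plain `"tokens"` key, so `message.get("tokens")` can return `None` even when tokenization worked. A minimal sketch of that mismatch, with a plain dict standing in for the real `Message` object:

```python
# Sketch: why message.get("tokens") can come back None in a Rasa 2.x
# custom component. The key names below are assumed from Rasa 2.x's
# rasa.nlu.constants.TOKENS_NAMES; a plain dict stands in for Message.
message_data = {
    "text": "szia",
    "text_tokens": ["szia"],  # simplified: real entries are Token objects
    "entities": [],
}

print(message_data.get("tokens"))       # None -> the "tokens None" symptom
print(message_data.get("text_tokens"))  # ['szia'] -> the tokens live here
```

If that is the cause, reading the tokens via `message.get("text_tokens")` (or `TOKENS_NAMES[TEXT]`) inside `process()` should return the list the tokenizer produced.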

@Sierra Do you have any idea about this?

# use process.extract(..., limit=3) to get multiple close matches

OR

use process.extractOne:

fuzzy_result = process.extractOne(token.text, lookup_data, scorer=fuzz.token_set_ratio)

# extractOne returns a (match, score) tuple, so compare the score
if fuzzy_result[1] > 80:
    # rest of the code
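To illustrate the (match, score) shape this suggestion relies on, here is a pure-stdlib stand-in for `process.extractOne` (fuzzywuzzy's real scorers such as `fuzz.token_set_ratio` compute the 0-100 score differently; the `extract_one` helper and the city list are made up for the example):

```python
from difflib import SequenceMatcher

def extract_one(query, choices):
    """Stand-in for fuzzywuzzy's process.extractOne: returns (match, score).

    Uses stdlib difflib; real fuzzywuzzy scorers compute the
    0-100 score differently.
    """
    def score(a, b):
        return int(round(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100))

    best = max(choices, key=lambda c: score(query, c))
    return best, score(query, best)

cities = ["Budapest", "Debrecen", "Szeged", "Miskolc"]
match, match_score = extract_one("budapst", cities)
print(match, match_score)  # the misspelling should match "Budapest" with a high score

# mirror the suggested threshold check: compare the score, not the tuple
if match_score > 80:
    print("accept", match)
```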

You need to check the match score of token.text against lookup_data, not the confidence, I guess.

Please look into FuzzyWuzzy in more detail. I have just given you my initial observations and suggestions. I hope you get my point.

Thanks.

@nik202 Thanks for the recommendation.

You are right, I definitely have to consider changing that later in my code.

My main problem is that I don't reach that part of the code yet, because I can't get any value. For some unknown reason I don't have tokens. It looks like SpacyTokenizer doesn't work correctly, or maybe at all. Do you have any suggestions for that? Should I replace SpacyTokenizer with something else?

@Sierra Go with the default configuration pipeline rather than spaCy.