Enhancing Rasa NLU models with Custom Components

ufukhurriyetoglu · September 25, 2019, 11:56am

I have a question about the second example, the one with the pretrained Sentiment Analyzer. We define sid = SentimentIntensityAnalyzer() inside process. Will this create a new instance each time process is called?
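In plain Python terms, an object constructed inside a method is rebuilt on every call to that method. Here is a minimal sketch of the difference. It uses a hypothetical stand-in class (CountingAnalyzer) instead of the real SentimentIntensityAnalyzer so that it runs without nltk, and the two component classes are illustrative only (they do not subclass Rasa's Component):

```python
class CountingAnalyzer:
    """Hypothetical stand-in for an expensive-to-build analyzer."""
    instances = 0

    def __init__(self):
        CountingAnalyzer.instances += 1

    def polarity_scores(self, text):
        # Dummy score; a real analyzer would inspect the text.
        return {"compound": 0.0}


class PerCallComponent:
    def process(self, text):
        sid = CountingAnalyzer()  # rebuilt on every call
        return sid.polarity_scores(text)


class ReusingComponent:
    def __init__(self):
        self.sid = CountingAnalyzer()  # built once, reused by process()

    def process(self, text):
        return self.sid.polarity_scores(text)


def count_instances(component, n_messages):
    """Return how many analyzers get built while processing n messages."""
    before = CountingAnalyzer.instances
    for _ in range(n_messages):
        component.process("hello")
    return CountingAnalyzer.instances - before
```

Moving the construction into __init__ (or into load, for a persisted component) is the usual way to pay the setup cost only once.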

Shak · September 26, 2019, 4:37pm

Hey @Juste

I was wondering if Custom Components could solve a problem that I’m having. I need an Entity that is specific to each user. For example, UserA may have an Entity “FamilyMembers” with [“John”, “Mary”, “Evgeni”] while UserB may have an Entity “FamilyMembers” with [“Leandro”]. Names such as Leandro would not be normally detected as it’s not a common name.

When one of our users is connected, we would like to have the Entity limited to that user’s data only. This way, we can guarantee that the name matches and is hopefully more accurate. DialogFlow calls this SessionEntity.

I was thinking a Custom Component could create a simple regex filter with the names of the family members. However, I’m not sure if there is a way to pass additional data to the component so that it can match against the user’s family members. Is this possible, or is there something else that could at least help us solve “unknown names”?
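The matching part of this idea can be sketched independently of Rasa. The function name, the extractor label, and the USER_NAMES lookup below are all hypothetical. How the current user's name list would reach a component is the open question and is not addressed here:

```python
import re
from typing import Dict, List


def extract_family_members(text: str, names: List[str]) -> List[Dict]:
    """Find whole-word, case-insensitive matches of the given names in text."""
    if not names:
        return []
    # Longest names first so a longer name wins over a shorter prefix of it.
    alternatives = sorted((re.escape(n) for n in names), key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)
    return [
        {
            "entity": "FamilyMembers",
            "value": m.group(0),
            "start": m.start(),
            "end": m.end(),
            "extractor": "user_names_regex",  # hypothetical extractor name
        }
        for m in pattern.finditer(text)
    ]


# Hypothetical per-user data; in practice this would come from your own store.
USER_NAMES = {"UserA": ["John", "Mary", "Evgeni"], "UserB": ["Leandro"]}
```

For example, extract_family_members("Call leandro and John", USER_NAMES["UserB"]) matches only "leandro", because John is not in UserB's list.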

Thank you for a great blog btw.

MohamedLotfyElrefai · November 20, 2019, 12:36pm

@Juste I’m facing a problem while creating a sentiment custom component that uses only nltk: it needs the model metadata to be defined. You can find the complete issue in this question

thanks in advance

yaput · January 22, 2020, 10:55am

Hi, can anyone help me with my sentiment component? I followed the blog post, and here is my code:

from rasa.nlu.components import Component
from rasa.nlu import utils
from rasa.nlu.model import Metadata

import nltk
from nltk.classify import NaiveBayesClassifier
from nltk.tokenize import word_tokenize # or use some other tokenizer
import json
import os

import typing
from typing import Any, Optional, Text, Dict

SENTIMENT_MODEL_FILE_NAME = "sentiment_classifier.pkl"



class SentimentAnalyzer(Component):
    """A custom sentiment analysis component"""
    name = "sentiment"
    provides = ["entities"]
    requires = ["tokens"]
    defaults = {}
    language_list = ["en"]
    print('initialised the class')

    def __init__(self, component_config=None):
        super(SentimentAnalyzer, self).__init__(component_config)
        # Start untrained so that process() can safely check self.clf
        self.clf = None

    def train(self, training_data, cfg, **kwargs):
        """Load the sentiment polarity labels from the text
           file, retrieve training tokens and after formatting
           data train the classifier."""
        self.training = []
        
        with open('./default_dataset_training.json', 'r') as raw_training_data:
            training_data = json.load(raw_training_data)
            print(training_data)
            neg = training_data['neg']
            for val in neg:
                self.training.append((val[0]['value'], 'neg'))
            
            pos = training_data['pos']
            for val_pos in pos:
                self.training.append((val_pos[0]['value'], 'pos'))

            processed_training = []
            for t in self.training:
                processed_training.append((self.preprocessing(word_tokenize(t[0])), t[1]))
                    
            self.clf = NaiveBayesClassifier.train(processed_training)



    def convert_to_rasa(self, value, confidence):
        """Convert model output into the Rasa NLU compatible output format."""

        entity = {"value": value,
                  "confidence": confidence,
                  "entity": "sentiment",
                  "extractor": "sentiment_extractor"}

        return entity
        

    def preprocessing(self, tokens):
        """Create bag-of-words representation of the training examples."""
        
        return ({word: True for word in tokens})


    def process(self, message, **kwargs):
        """Retrieve the tokens of the new message, pass it to the classifier
            and append prediction results to the message class."""
        
        if not self.clf:
            # component is either not trained or didn't
            # receive enough training data
            entity = None
        else:
            tokens = [t.text for t in message.get("tokens")]
            processed = self.preprocessing(tokens)
            pred = self.clf.prob_classify(processed)
            sentiment = pred.max()
            confidence = pred.prob(sentiment)

            entity = self.convert_to_rasa(sentiment, confidence)

            message.set("entities", [entity], add_to_output=True)


    def persist(self, file_name, model_dir):
        """Persist this model into the passed directory."""
        classifier_file = os.path.join(model_dir, SENTIMENT_MODEL_FILE_NAME)
        utils.json_pickle(classifier_file, self)
        return {"classifier_file": SENTIMENT_MODEL_FILE_NAME}

    @classmethod
    def load(cls,
             meta: Dict[Text, Any],
             model_dir=None,
             model_metadata=None,
             cached_component=None,
             **kwargs):
        file_name = meta.get("classifier_file")
        classifier_file = os.path.join(model_dir, file_name)
        return utils.json_unpickle(classifier_file)

Here is my config:

language: en
pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "sentiment.SentimentAnalyzer"
- name: "ner_crf"
- name: "ner_spacy"
- name: "ner_synonyms"
- name: CountVectorsFeaturizer
- intent_split_symbol: +
  intent_tokenization_flag: true
  name: EmbeddingIntentClassifier

When I test my NLU model, I always get the same result:

{
      "value": "neg",
      "confidence": 0.696105702364395,
      "entity": "sentiment",
      "extractor": "sentiment_extractor"
    }

But when I test the same code as a standalone script with the same training data:


from rasa.nlu.components import Component
from rasa.nlu import utils
from rasa.nlu.model import Metadata

import nltk
from nltk.classify import NaiveBayesClassifier
from nltk.tokenize import word_tokenize # or use some other tokenizer
import json
import os

import typing
from typing import Any, Optional, Text, Dict

training = []

def preprocessing(tokens):
    """Create bag-of-words representation of the training examples."""
    
    return ({word: True for word in tokens})
        
with open('./default_dataset_training.json', 'r') as raw_training_data:
    training_data = json.load(raw_training_data)
    print(training_data)
    neg = training_data['neg']
    for val in neg:
        training.append((val[0]['value'], 'neg'))
    
    pos = training_data['pos']
    for val_pos in pos:
        training.append((val_pos[0]['value'], 'pos'))

    processed_training = []
    for t in training:
        processed_training.append((preprocessing(word_tokenize(t[0])), t[1]))
            
    clf = NaiveBayesClassifier.train(processed_training)

    while True:
        text = input(">")
        tokenize = word_tokenize(text)
        processed = preprocessing(tokenize)
        pred = clf.prob_classify(processed)
        sentiment = pred.max()
        confidence = pred.prob(sentiment)

        print(sentiment)
        print(confidence)

It is working fine. Can someone help me with this? Thanks

rideep · February 10, 2020, 9:37am

@Juste Hi I am facing an issue in implementing custom components in rasa_nlu. In config I have put name: “sentiment.SentimentAnalyzer”

The error that I am getting during training

Can you please help me on this? I am using rasa_nlu 0.15.0

Juste · February 10, 2020, 1:02pm

Hi @rideep. You shouldn’t put the custom components code inside the rasa_nlu package. The file should sit in your project (assistant’s) directory. Do you get the same error if you structure your project files that way?

ArivCR7 · March 22, 2020, 4:59pm

Can anyone please answer this question: how do I load a model a single time and reuse it every time?

samgpt · April 30, 2020, 9:58pm

Hi @Juste. I was implementing the sentiment analysis component, but when training I keep getting the error: AttributeError: module ‘rasa.nlu.utils’ has no attribute ‘json_pickle’. I am running rasa version 1.10.0. I tried to use the Python pickle module as well, but no luck yet. Is there anything specific that I am missing? For starters I just used the code from your blog post.

AttributeError: module ‘rasa.nlu.utils’ has no attribute ‘json_pickle’

flore · July 7, 2020, 5:16pm

Hi! @samgpt I just read this, and I had the same issue when trying to follow the tutorial Enhancing Rasa NLU models with Custom Components… In rasa version 1.10.x try using this:

import rasa.utils.io as io_utils

and:

io_utils.json_pickle()

io_utils.json_unpickle()

This works for me. I’m using rasa 1.10.2

I’m writing this here in case someone else needs it. Regards!

flore · July 8, 2020, 12:44pm

Oops! I realized that there was a problem when the model is saved or loaded, and the SentimentAnalyzer always gave me the same answer! In the end I used pickle, as Collen suggests here: Sentiment analysis issue ! Need help please!
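The linked post is not reproduced in this thread, so the following is only a sketch of what "using pickle" could look like: persist just the trained classifier with the standard pickle module, and rebuild the component around it in load. The class name PickledSentimentStore is hypothetical and it does not subclass Rasa's Component. The method signatures simply mirror the ones in the code posted above:

```python
import os
import pickle

SENTIMENT_MODEL_FILE_NAME = "sentiment_classifier.pkl"


class PickledSentimentStore:
    """Illustrative persist/load pair; not a Rasa Component subclass."""

    def __init__(self, clf=None):
        self.clf = clf

    def persist(self, file_name, model_dir):
        """Pickle only the classifier and report the file name as metadata."""
        classifier_file = os.path.join(model_dir, SENTIMENT_MODEL_FILE_NAME)
        with open(classifier_file, "wb") as f:
            pickle.dump(self.clf, f)
        return {"classifier_file": SENTIMENT_MODEL_FILE_NAME}

    @classmethod
    def load(cls, meta, model_dir=None, **kwargs):
        """Rebuild the component around the unpickled classifier."""
        classifier_file = os.path.join(model_dir, meta["classifier_file"])
        with open(classifier_file, "rb") as f:
            return cls(clf=pickle.load(f))
```

The difference from the earlier code is that only self.clf is written to disk, rather than the whole component object.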

evilc3 · April 7, 2021, 12:02pm

Hey, can I add 2 custom components? My config file will look like this:

- name: WhitespaceTokenizer
- name: component 1
- name: component 2
- name: RegexFeaturizer …

Whenever I do this, only component 2 is used. Why is this happening?

Also, the form stops working when I use the custom sentiment component.

souvikg10 · April 7, 2021, 1:01pm

With custom components, make sure the component is not overwriting the object you want to output (tokens, features or entities). It is possible that the second component initializes a new object and overrides the list you want to append to.

This happened to me when I introduced a custom entity extractor.

An example would be:

message.set(
            ENTITIES, message.get(ENTITIES, []) + extracted_entities, add_to_output=True
        )
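To make the difference concrete, here is a small self-contained sketch. FakeMessage is a hypothetical stand-in that only mimics the get/set calls used above. It is not Rasa's Message class:

```python
ENTITIES = "entities"


class FakeMessage:
    """Hypothetical stand-in for Rasa's Message, with get/set only."""

    def __init__(self):
        self.data = {}

    def get(self, prop, default=None):
        return self.data.get(prop, default)

    def set(self, prop, value, add_to_output=False):
        self.data[prop] = value


def overwrite(message, extracted):
    # Replaces whatever earlier components stored under ENTITIES.
    message.set(ENTITIES, extracted, add_to_output=True)


def append(message, extracted):
    # Keeps earlier entities and adds the new ones at the end.
    message.set(ENTITIES, message.get(ENTITIES, []) + extracted, add_to_output=True)


def run(step):
    """Apply the same step for two components in a row and return the result."""
    message = FakeMessage()
    step(message, [{"entity": "sentiment", "value": "pos"}])
    step(message, [{"entity": "city", "value": "Berlin"}])
    return [e["entity"] for e in message.get(ENTITIES)]
```

With overwrite, only the second component's entity survives. With append, both are kept.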

Hanane · April 26, 2021, 12:46pm

Hi @Juste, I am using rasa 2.0. I created a Python package named customcomponent inside the rasa nlu folder, and inside it I created a file sentiment.py. In registry.py I added an import: from rasa.nlu.customcomponent import SentimentAnalyzer. I added the class name SentimentAnalyzer to the component_classes list, and I added this component to config.yml: - name: sentiment.SentimentAnalyzer. But it gives me this error: ModuleNotFoundError: No module named ‘sentiment’. Can you help me, please?
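Assuming a name such as sentiment.SentimentAnalyzer is resolved with a normal Python import (which the ModuleNotFoundError suggests), the module has to be importable from the directory you run from or from an entry on the Python path. The sketch below illustrates this with plain importlib. The load_class helper and the module name sentiment_demo_component are hypothetical, and this is not Rasa's own loading code:

```python
import importlib
import os
import sys
import tempfile


def load_class(dotted_path):
    """Resolve 'module.ClassName' with the regular Python import machinery."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def demo():
    """Try the import before and after putting the project dir on sys.path."""
    # Hypothetical module name, chosen to avoid clashing with a real sentiment.py.
    name = "sentiment_demo_component"
    with tempfile.TemporaryDirectory() as project_dir:
        with open(os.path.join(project_dir, name + ".py"), "w") as f:
            f.write("class SentimentAnalyzer:\n    pass\n")
        try:
            load_class(name + ".SentimentAnalyzer")
            found_before = True
        except ModuleNotFoundError:
            found_before = False
        # Similar effect to listing the project directory in PYTHONPATH.
        sys.path.insert(0, project_dir)
        try:
            importlib.invalidate_caches()
            found_after = load_class(name + ".SentimentAnalyzer").__name__
        finally:
            sys.path.remove(project_dir)
            sys.modules.pop(name, None)
        return found_before, found_after
```

Under that assumption, a file placed at rasa/nlu/customcomponent/sentiment.py would be importable as rasa.nlu.customcomponent.sentiment, not as a top-level module called sentiment.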

hemanthyernagula · August 1, 2021, 12:53pm

I’m trying to make a custom component for my bot, and I’m getting this error:

tokens = [list(map(lambda x: x.text, t.get('tokens'))) for t in training_data]
TypeError: 'NoneType' object is not iterable

Can anyone help me out? What I understood is that the objects in the training data do not have anything like tokens.

fkoerner · August 5, 2021, 7:08am

@hemanthyernagula it sounds like training_data is None… Can you check this?

abhi · August 12, 2021, 11:21pm

I am getting a similar error but when I check training_data it is not None.
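In plain Python, the same TypeError is also raised when training_data itself is fine but t.get('tokens') returns None for some example, because map() then tries to iterate over None. Whether that is the cause here is not confirmed in this thread. A defensive sketch, with hypothetical names (collect_token_texts, Token) and plain dicts standing in for training examples, could look like this:

```python
def collect_token_texts(training_examples):
    """Return token texts per example, skipping examples without tokens."""
    if training_examples is None:
        raise ValueError("training examples are None")
    token_texts = []
    for example in training_examples:
        tokens = example.get("tokens")
        if tokens is None:
            # Possible cause (assumption): no tokens were set on this example.
            continue
        token_texts.append([t.text for t in tokens])
    return token_texts


class Token:
    """Hypothetical minimal token with a .text attribute."""

    def __init__(self, text):
        self.text = text
```

Counting how many examples get skipped is a quick way to tell a missing tokenizer step from a single malformed example.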

fkoerner · August 17, 2021, 7:06am

@abhi are you following the tutorial? If not, could you make a new post for this question, so that we can keep this one from getting (even more) cluttered? Feel free to tag me on the new post. Otherwise, could you share what you have modified from the tutorial?

Fares · September 12, 2021, 1:46pm

Hello @Juste. This is what I did:

create the sentiments.py
update my pipeline in config.yml
create the labels.txt

I have trained my bot, but when I run it (rasa shell --debug) I get nothing, and I can’t see the sentiment or the confidence printed. I’m very confused.

shreehari · September 29, 2021, 8:19pm

Hello @Juste

Can you please confirm that the sentiment.py would go in the main project directory and not inside the actions folder?
Can you also tell me how I can check the rasa nlu output? I usually use tracker.events. But if I want to check whether my custom component has been called and is giving me the right results, how can I see where the output of convert_to_rasa() is being added?
Could you also give more insight into setting the PYTHONPATH so that my custom component is picked up by rasa? By that I mean more step-by-step instructions on setting the PYTHONPATH… Does that mean I just do ‘which python’ and add that to my PYTHONPATH, and then add ‘export PYTHONPATH=/path_to_your_project_dir/:$PYTHONPATH’ to ~/.bash_profile?
The last question would be using huggingface models with ‘import pipeline’ inside custom components. Do you see any issues with that approach?
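On checking the component's output: assuming you have the parsed message as a dict (for instance what rasa shell nlu prints, or tracker.latest_message inside a custom action), the sentiment entry can be picked out by its extractor name. The sentiment_entities helper is hypothetical, and SAMPLE_RESULT is hand-written in the shape shown earlier in this thread, not real output:

```python
def sentiment_entities(parse_result):
    """Return the entities that came from the sentiment component."""
    return [
        e for e in parse_result.get("entities", [])
        if e.get("extractor") == "sentiment_extractor"
    ]


# Hand-written sample in the shape shown earlier in this thread, not real output.
SAMPLE_RESULT = {
    "text": "I love it",
    "entities": [
        {"value": "pos", "confidence": 0.9, "entity": "sentiment",
         "extractor": "sentiment_extractor"},
        {"value": "Berlin", "entity": "city", "extractor": "ner_crf"},
    ],
}
```

An empty list from sentiment_entities would suggest that the component did not run or that a later component replaced its entities.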

Looking forward to hearing from you. Thanks

baval · May 11, 2022, 4:17pm

Here is the latest blog regarding implementing custom components in rasa 3.0
