Custom Sentiment Analyzer component based on DIET

Hi,

I want to build a custom sentiment classifier on top of DIET. Simply using DIET as the intent and entity classifier is not what I am looking for. Please take a look at the sentiment classifier below:

from rasa.nlu.components import Component
from rasa.nlu import utils
from rasa.nlu.model import Metadata

from nltk.sentiment.vader import SentimentIntensityAnalyzer
import os
import nltk
nltk.download('vader_lexicon')

class SentimentAnalyzer(Component):
    """A pre-trained sentiment component"""

    name = "custom_component.SentimentAnalyzer"
    provides = ["entities"]
    requires = []
    defaults = {}
    language_list = ["en"]

    def __init__(self, component_config=None):
        super(SentimentAnalyzer, self).__init__(component_config)

    def train(self, training_data, cfg, **kwargs):
        """Not needed, because the model is pretrained"""
        pass

    def convert_to_rasa(self, value, confidence):
        """Convert model output into the Rasa NLU compatible output format."""

        entity = {"value": value,
                  "confidence": confidence,
                  "entity": value,
                  "extractor": "sentiment_extractor"}

        return entity

    def process(self, message, **kwargs):
        """Retrieve the text message, pass it to the classifier
            and append the prediction results to the message class."""

        sid = SentimentIntensityAnalyzer()
        data = ""
        try:
            data = message.data['text']
        except KeyError:
            pass
        res = sid.polarity_scores(data)
        key, value = max(res.items(), key=lambda x: x[1])

        entity = self.convert_to_rasa(key, value)

        message.set("entities", [entity], add_to_output=True)

    def persist(self, file_name, model_dir):
        """Pass because a pre-trained model is already persisted"""

        pass

As you can see, we can build a custom classifier on top of NLTK that attaches a sentiment entity to every message, whatever intent is predicted, including nlu_fallback.

        entity = {"value": value,
                  "confidence": confidence,
                  "entity": value,  # sentiment entity for all intents including nlu_fallback
                  "extractor": "sentiment_extractor"}
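
One detail worth checking in the component above: VADER's polarity_scores returns four keys (neg, neu, pos and compound), and the max() in process runs over all of them, so "compound" can end up as the entity value. Below is a minimal sketch of the label selection with "compound" left out. The score values are made up for illustration, and pick_sentiment is a hypothetical helper, not part of Rasa or NLTK.

```python
def pick_sentiment(scores):
    """Return the (label, score) pair with the highest score, ignoring 'compound'."""
    # Keep only the per-class proportions (neg / neu / pos).
    candidates = {k: v for k, v in scores.items() if k != "compound"}
    return max(candidates.items(), key=lambda item: item[1])


# Made-up scores in the shape VADER returns; not real model output.
example_scores = {"neg": 0.0, "neu": 0.3, "pos": 0.7, "compound": 0.8}

# Selection as written in the component: every key competes, including 'compound'.
original_label, original_score = max(example_scores.items(), key=lambda x: x[1])

# Selection restricted to the three sentiment classes.
label, score = pick_sentiment(example_scores)

entity = {"value": label,
          "confidence": score,
          "entity": label,
          "extractor": "sentiment_extractor"}
```

Whether "compound" should be excluded depends on what you want the entity to mean, so treat this as one option rather than a fix.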

Along the same lines, I would like to build a multinomial emotion classifier, if possible on top of the DIET classifier. The config would then look like this:

pipeline:
   - name: WhitespaceTokenizer
   - name: RegexFeaturizer
   - name: CountVectorsFeaturizer
   - name: CountVectorsFeaturizer
     analyzer: char_wb
     min_ngram: 1
     max_ngram: 4
   - name: custom_component.SentimentAnalyzer  # based on DIET
   - name: DIETClassifier
     epochs: 100
     constrain_similarities: true
   - name: EntitySynonymMapper
   - name: FallbackClassifier
     threshold: 0.6
     ambiguity_threshold: 0.1

Is there any way to use it for a custom component?


It sounds like you want to “hack” entities so that they represent sentiment. Would I be correct to say that you’d like to use this information in a custom action later? There’s merit to the idea in the sense that DIET can only predict the intent and no other “tags” that you may want to attach. So in that sense, technically, you could go about sentiment this way.
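
If that is the goal, here is a rough sketch of how the sentiment could be read back out of a parsed message, for example from tracker.latest_message inside a custom action. It assumes the entity dict has the shape produced by convert_to_rasa in your component. get_sentiment and the parsed example are hypothetical and only there for illustration.

```python
def get_sentiment(latest_message):
    """Return (label, confidence) from the sentiment entity, or None if absent."""
    for entity in latest_message.get("entities", []):
        # The custom component tags its entities with this extractor name.
        if entity.get("extractor") == "sentiment_extractor":
            return entity.get("value"), entity.get("confidence")
    return None


# Hand-written example of a parsed message; the values are illustrative only.
parsed = {
    "text": "this is great",
    "intent": {"name": "nlu_fallback", "confidence": 0.4},
    "entities": [
        {"value": "pos",
         "confidence": 0.7,
         "entity": "pos",
         "extractor": "sentiment_extractor"},
    ],
}

sentiment = get_sentiment(parsed)
no_sentiment = get_sentiment({"text": "hello", "entities": []})
```

Filtering on the extractor name keeps this separate from the entities DIET extracts, so the two don't get mixed up downstream.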

I would be very skeptical of the general performance of sentiment tools, though. In my experience, they are far from perfect. One issue is described in this Algorithm Whiteboard video on toxic language detection. What’s the use case you have for sentiment in your assistant?