Enhancing Rasa NLU models with Custom Components

Check out our latest tutorial on how to implement custom components and add them to your Rasa NLU pipeline! If you have added custom components to your Rasa NLU models, share your experience with us!

@Juste thank you for your post! It is really helpful!

I have a question. Is it possible to pass the path to the labels file, “labels.txt”, to the rasa_nlu.train command in the same way as “--data”? For example, by adding a “--labels” option.

Or is there another way to specify the file path (not inside the script)?

Hi, I created a SentimentAnalyzer.py file that contains the code of my custom component. I overrode the class attributes:

class SentimentAnalyzer(Component):
    name = "sentiment"
    provides = ["entities"]
    requires = ["tokens"]
    defaults = {}
    language_list = ["en"]

This is my pipeline:

language: "en"
pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "SentimentAnalyzer.sentiment"

This error appears:

Traceback (most recent call last):
  File "C:/Users/39392/PycharmProjects/starter-pack-rasa-nlu-master/training.py", line 9, in <module>
    trainer = Trainer(config.load("nlu_config.yml"))
  File "C:\Users\39392\AppData\Local\Programs\Python\Python36\Lib\site-packages\rasa_nlu\model.py", line 152, in __init__
    components.validate_requirements(cfg.component_names)
  File "C:\Users\39392\AppData\Local\Programs\Python\Python36\Lib\site-packages\rasa_nlu\components.py", line 49, in validate_requirements
    from rasa_nlu import registry
  File "C:\Users\39392\AppData\Local\Programs\Python\Python36\Lib\site-packages\rasa_nlu\registry.py", line 65, in <module>
    registered_components = {c.name: c for c in component_classes}
  File "C:\Users\39392\AppData\Local\Programs\Python\Python36\Lib\site-packages\rasa_nlu\registry.py", line 65, in <dictcomp>
    registered_components = {c.name: c for c in component_classes}
AttributeError: module 'SentimentAnalyzer' has no attribute 'name'

Why?
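
One possible reading of this traceback, offered as a guess rather than a confirmed diagnosis: the message says that a module named SentimentAnalyzer lacks `name`, which suggests that the module object, not the class inside it, ended up in `component_classes` (for example via `import SentimentAnalyzer` instead of `from SentimentAnalyzer import SentimentAnalyzer`). The sketch below reproduces that situation with stand-in objects; `Component`, `build_registry`, and the fake module are illustrative and are not Rasa code.

```python
import types


class Component:
    """Stand-in for rasa_nlu.components.Component (assumption: only `name` matters here)."""
    name = ""


class SentimentAnalyzer(Component):
    name = "sentiment"


# A fake module that shares its name with the class it contains,
# mimicking a file called SentimentAnalyzer.py.
fake_module = types.ModuleType("SentimentAnalyzer")
fake_module.SentimentAnalyzer = SentimentAnalyzer


def build_registry(component_classes):
    """Same dict comprehension as the line shown in the traceback."""
    return {c.name: c for c in component_classes}


# Registering the class works.
registry = build_registry([fake_module.SentimentAnalyzer])
print(registry)

# Registering the module object fails, because a module has no `name` attribute.
try:
    build_registry([fake_module])
except AttributeError as err:
    error_message = str(err)
    print(error_message)
```

If that is the cause, importing the class itself and referencing it in the pipeline by module name plus class name (here that would be `SentimentAnalyzer.SentimentAnalyzer`) may resolve it.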

Hi @Juste,

Thank you so much for this tutorial. I was trying it out and got the error below:

File "C:\Users\ab56837\Desktop\ChatBot\New bot\Bot\sentiment_analysis.py", line 29, in train
    with open('labels.txt', 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'labels.txt'

I solved this error by using the code below:

try:
    with open('labels.txt', 'r') as f:
        labels = f.read().splitlines()
except:
    with open('labels.txt', 'w') as f:
        labels = f.write("")

Now I am getting a new error:

File "C:\Users\ab56837\Desktop\ChatBot\New bot\Bot\sentiment_analysis.py", line 38, in train
    labeled_data = [(t, x) for t,x in zip(processed_tokens, labels)]
TypeError: zip argument #2 must support iteration

Please help me implement this custom component.

Thanks,

Mohan
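
A note on why the second error follows from the workaround: `f.write("")` returns the number of characters written (an integer), so `labels` becomes `0` and `zip` cannot iterate over it. Below is a minimal sketch of a stricter loader, assuming one label per line; `load_labels` is a hypothetical helper and not part of the tutorial.

```python
import os
import tempfile


def load_labels(path):
    """Read one label per line and fail with a clear message if the file is missing."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Expected a labels file at {path!r}")
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "labels.txt")

    # The workaround from the post: write() returns a character count, not a list.
    with open(path, "w", encoding="utf-8") as f:
        write_result = f.write("")

    # An empty file loads as an empty list, which zip() can iterate over.
    empty_labels = load_labels(path)

    # A file with real labels loads as a list of strings.
    with open(path, "w", encoding="utf-8") as f:
        f.write("pos\nneu\nneg\n")
    labels = load_labels(path)

print(type(write_result).__name__, empty_labels, labels)
```

Raising on a missing file, rather than silently creating an empty one, keeps the real problem (no labels to train on) visible.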

Hey @mohan. The error says that you need a labels.txt file in your working directory. Based on the example provided in the tutorial, this file should contain sentiment labels for the NLU training examples you use to train the NLU model. A tiny snippet of how this file could look:

pos
pos
neu
neu
neg
neg

So, to replicate the provided example, you should create a labels.txt file in your working directory and add the sentiment labels for your NLU training examples. Give it a go and let me know if you still face issues.
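
To make the relationship concrete, here is a minimal sketch that pairs each training example with its label and checks that the counts match. It assumes the labels are listed in the same order as the training examples; `pair_examples_with_labels` and the sample sentences are illustrative.

```python
def pair_examples_with_labels(examples, labels):
    """Pair each training example with the label on the same line number."""
    if len(examples) != len(labels):
        raise ValueError(
            f"{len(examples)} training examples but {len(labels)} labels; "
            "labels.txt needs exactly one label per example"
        )
    return list(zip(examples, labels))


# Hypothetical training examples, in the order they appear in the NLU data.
examples = [
    "I love this",
    "this is great",
    "what time is it",
    "show me the menu",
    "this is terrible",
    "I hate waiting",
]
# Contents of labels.txt, one label per line, as in the snippet above.
labels_text = "pos\npos\nneu\nneu\nneg\nneg\n"
labels = labels_text.splitlines()

pairs = pair_examples_with_labels(examples, labels)
for text, label in pairs:
    print(f"{label}\t{text}")
```

A count mismatch is worth catching early, because `zip` silently truncates to the shorter input.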

@Juste thank you so much for your response. It was a silly mistake on my part… I have implemented sentiment analysis successfully. Kindly have a look at this thread, File upload and display document in chat window, and help if you can; I wasn’t able to tag any members on that thread. Thank you for your time.

Hi,

I followed the tutorial and added my own custom component for detecting quantities and a general singular/plural classification to my pipeline. Just like the sentiment analyzer from the tutorial, my component creates new entities which get detected for every user input. This works well, but I’m not sure whether this is the best way to do this kind of task. As I don’t highlight certain words in the input, the values for “start” and “end” of my entities are both set to 0, which seems to confuse the NLU trainer (warning message: “Make sure the start and end values of the annotated training examples end at token boundaries (e.g. don’t include trailing whitespaces or punctuation)”).

Also, these entities are written into the NLU data generated from interactive learning, which is unnecessary overhead as this data does nothing for the NLU classification. I don’t really need the entities; I just want to have the information mapped to the corresponding slots for each user input, which works perfectly with slots that have the same name. I still wonder if there is a better way of accomplishing this, especially as I’m considering adding more custom components, which would eventually result in a rather confusing number of entities for every user input.

Any ideas on this? Thank you very much!

Hi Alex,

Quick thought: would putting your custom entity extractors at the end of the pipeline work?

Hey @alex38. Apologies for such a late response. Could you give an example of what information you would like your custom components to handle? With quantities, for example, I suspect simply using Duckling would be much better and easier.

Thank you for the suggestions @netcarver and @Juste! My current approach is to put my custom extractors at the end of the pipeline and extract entities only for specific intents instead of having them identified generally in every NLU output. This targets the problem of numerous entities in the training data with no start/end values and no relation to the actual data quite well. Also I currently do not plan to add further custom components, other than sentiment analysis and the singular/plural classification, therefore this solution might suffice for the future.
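
The approach described here (a custom extractor at the end of the pipeline that only acts on certain intents) could look roughly like the sketch below. `Message` is a small stand-in for the Rasa message object so the example runs on its own, and `QuantityExtractor`, the intent names, and the plural heuristic are all hypothetical.

```python
class Message:
    """Minimal stand-in for a Rasa NLU message (assumption: get/set on a dict)."""

    def __init__(self, text, data=None):
        self.text = text
        self.data = data or {}

    def get(self, prop, default=None):
        return self.data.get(prop, default)

    def set(self, prop, info):
        self.data[prop] = info


class QuantityExtractor:
    """Hypothetical extractor that only runs for selected intents."""

    name = "quantity_extractor"
    # Illustrative intent names; replace with the intents that need the slot.
    target_intents = {"order_item"}

    def process(self, message, **kwargs):
        intent = (message.get("intent") or {}).get("name")
        if intent not in self.target_intents:
            return  # leave other messages untouched

        # Naive plural check, for illustration only.
        is_plural = any(tok.endswith("s") for tok in message.text.lower().split())
        entity = {
            "entity": "grammatical_number",
            "value": "plural" if is_plural else "singular",
            "extractor": self.name,
        }
        message.set("entities", message.get("entities", []) + [entity])


extractor = QuantityExtractor()

order = Message("two coffees please", {"intent": {"name": "order_item"}})
extractor.process(order)

greeting = Message("hello friends", {"intent": {"name": "greet"}})
extractor.process(greeting)

print(order.get("entities"))
print(greeting.get("entities"))
```

Because the component reads the predicted intent, it only works when placed after the intent classifier in the pipeline.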

Can I get demo training data and a labels.txt for this sentiment.py?

Create a custom class

  • Required class, methods, and their parameters

from typing import Any, Dict, Optional, Text

from rasa_nlu.components import Component


class LocationExtractor(Component):

    def __init__(self, component_config=None):
        # Initializes the component; pass your own class name to super().
        super(LocationExtractor, self).__init__(component_config)

    def train(self, training_data, cfg, **kwargs):
        # Trains the component.
        # training_data: the training data (a Rasa TrainingData object)
        # cfg: the configuration
        pass

    def persist(self, file_name: Text, model_dir: Text) -> Optional[Dict[Text, Any]]:
        # Saves the trained component to disk for later use.
        pass

    @classmethod
    def load(
            cls,
            meta: Dict[Text, Any],
            model_dir: Optional[Text] = None,
            model_metadata: Optional["Metadata"] = None,
            cached_component: Optional["Component"] = None,
            **kwargs: Any
    ) -> "Component":
        # Loads the persisted component.
        pass

    def process(self, message, **kwargs):
        # Handles an incoming message: read the user text from message.text
        # and use message.set(...) to add results to the output.
        pass

These are the methods required to create a custom component class.

Steps to configure it in rasa_nlu

- Create a folder under the main rasa_nlu folder, for example "custom_component".

- Import your custom class in registry.py.

- Add your class to the component_classes list in registry.py.

- Give the class an alias in the old_style_names dictionary.

- (Optional) Add your alias to the registered_pipeline_templates dictionary,
    within the list for pretrained_embeddings_spacy or supervised_embeddings.

- In config.py, check the handling of the index exception in:

   def component_config_from_pipeline(
       index: int,
       pipeline: List[Dict[Text, Any]],
       defaults: Optional[Dict[Text, Any]] = None) -> Dict[Text, Any]:

Hello,

I have followed the tutorial How to Enhance Rasa NLU Models with Custom Components on the Rasa Blog, created a SentimentAnalyzer class, and linked it in the config pipeline. When training the NLU model, the sentiment classifier trains, but when testing it does not output any sentiment label. My guess is that my labels.txt file has the wrong format for my training dataset. I do not know what label file format corresponds to a more complex training file.

Training file example:

language: "en"

pipeline: "spacy_sklearn"

# data - this is the combined .md files

data: |
  ## synonym:rbry-mqwu
  - hospital
  - hospitals

  ## synonym:9wzi-peqs
  - psychiatrist

  ## intent:goodbye
  - bye
  - goodbye
  - c ya
  - see you later
  - see ya
  - bye bye
  - cheers
  - Bye
  - Goodbye
  - See you later
  - Bye bot
  - Goodbye friend
  - bye
  - bye for now
  - catch you later
  - gotta go
  - See you
  - goodnight
  - have a nice day
  - i'm off
  - see you later alligator
  - we'll speak soon
  

  ## intent:ask_for_doctor
  - I want to talk with an specialist!
  - I want to speak with a doctor
  - I am feeling very sad
  - I have a very low self esteem
  - I want to die
  - I want to kill myself
  - I am not comfortable in my own skin
  - I cannot stand myself
  - I hate everyone.
  - Please help me, I want to kill myself
  - I am in great pain, please help me
  - Wabalabadupdup
  - I feel so lost, trapped and like everything is out of my control
  - Can you make my appointments?
  - I’m not coping.
  - Can we cancel going outside and stay in instead
  - I want to say I’m fine, but you know what? I’m really not
  - Today is not a good day for me
  - Can you text me instead of calling?
  - I would really benefit from some company
  - Can you make sure I get up in time?
  - I’m struggling to manage my self-care
  - I am feeling alone.
  - I do not want to live anymore.
  - I hate myself.
  - I am not feeling ok.

Can someone please help me with the format of the labels.txt file? Thank you.
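
Going by the earlier reply in this thread, labels.txt holds one sentiment label per training example. Under that assumption, a small script can list the examples in the Markdown training data and build a placeholder file to edit by hand. This is a sketch: `extract_intent_examples` is a hypothetical helper, it assumes only lines under `## intent:` headings count as training examples, and the order in which Rasa loads examples should be verified before relying on it.

```python
def extract_intent_examples(markdown_text):
    """Collect the '- example' lines that sit under '## intent:' headings."""
    examples = []
    in_intent_section = False
    for raw_line in markdown_text.splitlines():
        line = raw_line.strip()
        if line.startswith("##"):
            # Synonym and other non-intent sections are skipped.
            in_intent_section = line[2:].strip().startswith("intent:")
        elif in_intent_section and line.startswith("- "):
            examples.append(line[2:].strip())
    return examples


sample = """
## synonym:rbry-mqwu
- hospital
- hospitals

## intent:goodbye
- bye
- see you later

## intent:ask_for_doctor
- I want to speak with a doctor
- I am feeling very sad
"""

examples = extract_intent_examples(sample)

# Build one placeholder label per example; save this text as labels.txt
# and replace each line by hand with pos, neu, or neg.
placeholder_labels = "\n".join("neu" for _ in examples) + "\n"
print(len(examples), "examples found")
print(placeholder_labels)
```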

@Juste I want to pick up a question that was asked earlier but hasn’t been answered yet:

Is it possible to provide components with custom data without having to hardcode file paths into the component code? (Like the example does with the “labels.txt” file).

I have a custom component that reads extra fields defined in the JSON train files, but that’s hacky and doesn’t work properly in some cases. I think one way is to define paths in the config.yml file as component configs, but that turns the config from a static pipeline definition into a training-specific data definition. Is there a better way?
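
The config-based option mentioned above could be sketched as follows. Rasa components declare a `defaults` dictionary whose values can be overridden by the keys given for that component in the pipeline; the stand-in base class below imitates that merge so the example runs without Rasa installed. The `labels_path` key is a hypothetical name, not an existing Rasa option.

```python
class Component:
    """Stand-in for the Rasa Component base class (assumption: defaults merged with config)."""

    defaults = {}

    def __init__(self, component_config=None):
        merged = dict(self.defaults)
        merged.update(component_config or {})
        self.component_config = merged


class SentimentAnalyzer(Component):
    name = "sentiment"
    # Hypothetical option: where to find the label file.
    defaults = {"labels_path": "labels.txt"}

    def labels_path(self):
        return self.component_config["labels_path"]


# Equivalent of a pipeline entry without extra keys.
default_component = SentimentAnalyzer()

# Equivalent of a pipeline entry such as:
#   - name: "sentiment.SentimentAnalyzer"
#     labels_path: "data/sentiment_labels.txt"
configured_component = SentimentAnalyzer(
    {"name": "sentiment.SentimentAnalyzer", "labels_path": "data/sentiment_labels.txt"}
)

print(default_component.labels_path())
print(configured_component.labels_path())
```

This keeps the path out of the component code, though as noted it does tie the config file to a particular data set.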

Hi, the tutorial is very useful, but I face a similar problem and couldn’t solve it. I created a file sentiment.py in my project and added the following to my pipeline:

- name: "sentiment.SentimentAnalyzer"

I also ran:

export PYTHONPATH=/path_to_your_project_dir/:$PYTHONPATH

But when I launch the training, I get this error message:

Exception: Failed to find component class for 'sentiment.SentimentAnalyzer'. Unknown component name. Check your configured pipeline and make sure the mentioned component is not misspelled. If you are creating your own component, make sure it is either listed as part of the component_classes in rasa.nlu.registry.py or is a proper name of a class in a module.

Any idea? Thanks.
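
For context, a `module.ClassName` entry in the pipeline can only be resolved if the module is importable from the Python process that runs training. The sketch below shows that general mechanism with `importlib`; it is not Rasa's actual loader, and `class_from_module_path` is written here for illustration. It creates a throwaway `sentiment_demo.py` in a temporary directory to stand in for the project's component file.

```python
import importlib
import os
import sys
import tempfile


def class_from_module_path(module_path):
    """Resolve 'package.module.ClassName' to the class object."""
    module_name, _, class_name = module_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


with tempfile.TemporaryDirectory() as project_dir:
    # Stand-in for the project's component file.
    with open(os.path.join(project_dir, "sentiment_demo.py"), "w") as f:
        f.write("class SentimentAnalyzer:\n    name = 'sentiment'\n")

    # Without the project directory on sys.path the import fails.
    try:
        class_from_module_path("sentiment_demo.SentimentAnalyzer")
        found_without_path = True
    except ImportError:
        found_without_path = False

    # Adding the directory (what PYTHONPATH does at startup) makes it importable.
    sys.path.insert(0, project_dir)
    importlib.invalidate_caches()
    try:
        component_class = class_from_module_path("sentiment_demo.SentimentAnalyzer")
    finally:
        sys.path.remove(project_dir)

print(found_without_path, component_class.name)
```

A quick check along the same lines is to try importing the module from the same shell and environment that launches training; if that import fails, the pipeline entry is likely to fail too.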

I solved my problem by writing the SentimentAnalyzer script directly inside registry.py (/lib/python3.6/site-packages/rasa/nlu).

I could train the bot and start Rasa. A new slot “sentiment” is there and it can detect positive/negative/neutral messages from the user. But my other custom actions stopped working, as if the pipeline could no longer extract or detect the other entities they rely on. For example, the bot can no longer give me the definition of a word when I ask for it (I have created a custom action with a slot “definition” for that). I tried changing the position of SentimentAnalyzer in the pipeline, with no result. If I disable SentimentAnalyzer, everything works fine again. What should I do? How can I keep the sentiment detection and the other skills of my custom actions? Thanks.

I’ve put “SentimentAnalyzer” in the pipeline as shown below, and everything is now working:

pipeline:
- name: "WhitespaceTokenizer"
- name: "SentimentAnalyzer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
  analyzer: "word"
  OOV_token: OOV
- name: "EmbeddingIntentClassifier"

Hi @Juste /@juste_petr / @ikenti / @akelad ,

I have built my custom component. When I put the component inside the rasa/nlu/ directory, it worked fine. But when I moved my custom code out to a local directory, it stopped working, even though I set PYTHONPATH to point to that local directory.

Is it possible to load custom code from outside of Rasa and have it called as part of Rasa NLU? Regards,

Hi @hardiksanchawat. If you have your component in your local directory, what error do you get? Also, can you include your config.yml file here so I can check how you reference your custom component?

Hi @Juste / @akelad

Please find the error and config.yml below.

Exception: Failed to find component class for 'rasa.nlu.custom_component.custom_component_spellcheck.SpellCheckAnalyzer'. Unknown component name. Check your configured pipeline and make sure the mentioned component is not misspelled. If you are creating your own component, make sure it is either listed as part of the `component_classes` in `rasa.nlu.registry.py` or is a proper name of a class in a module.
sys:1: RuntimeWarning: coroutine 'BaseEventLoop.create_server' was never awaited

config.yml

language: en
pipeline:
- name: custom_component.custom_component_spellcheck.SpellCheckAnalyzer
- name: supervised_embeddings

policies:
- name: MemoizationPolicy
- name: KerasPolicy
- name: MappingPolicy