Can I have my own training or retrieving algorithms?

Hey guys, I am new to Rasa. I am working on an FAQ chatbot project and built a chatbot demo with the Rasa framework (with actions.yml, nlu.md, stories.md, etc.). However, I was told that it is too simple and that the performance is not good enough. I was also advised to build my own training or retrieving algorithms. I don't know how to get started, since I don't know how to modify the training code. Each time I train the model, I simply prepare the data and run rasa train. It seems that I cannot change anything in the Rasa Core algorithms. I've taken some machine learning and data retrieval courses and know some algorithms that could potentially be used in this project.

Could anyone give me some ideas?

What exactly do you mean by retrieving algorithms?

Rasa is divided into two parts: NLU and Core. The NLU part processes the message, extracts entities, and classifies the intent. You can define the components that you want to use for it; see Choosing a Pipeline and Components. If you want to write your own entity extractor or intent classifier, you can do so by writing your own component (Custom NLU Components). The Core part is responsible for predicting the next action. We have several policies in place; see Policies.

Does that help?


Thank you, that helps a lot. By the way, if I am not satisfied with the performance of the default Rasa setup, which part should I consider improving first (for example, by adding a custom component)? In other words, which added component brings the most benefit?

It depends. What part are you not satisfied with?

If you are not satisfied with the overall performance of your bot, I would recommend that you test your models to see what part is not working well. See Evaluating Models for that. You can also run your bot in --debug mode to see what the bot is predicting (entities, intents, actions) and what does not work that well.

The easiest way to modify your bot is to change the NLU pipeline. You can try out different components and configure them according to your needs (Components). This will have an impact on your NLU model. If the NLU model is more certain, the bot might also make better predictions for the next action.


Now I got the whole picture. Thank you so much.

I tried to add the sentiment analyzer to my pipeline, and it basically works. However, when I run rasa shell nlu and enter messages that are identical to sentences in nlu.md, the sentiment analyzer always returns the same result with the same confidence, no matter what I enter.

I checked the input data and labels in train(); they are zipped correctly. The input message in process() is also passed in and preprocessed correctly.

Where could the problem be?

Hard to tell without seeing the code. Can you share it?

What is your model returning? E.g. in process(): how do you save the result of the sentiment analyzer?

The problem was fixed by saving the model to disk and reading it back in process(). The code was exactly the same as the sentiment analyzer in Rasa's tutorial, where the model is just stored in the class as a variable. I am not very familiar with how Python works, but I think the reason may be that once training finishes, the model in memory is freed because the program terminates. The Rasa framework then loads something wrong when process() is called.

Anyway, it works now.

As a next step, I want to train my model on the output of a featurizer or tokenizer in my pipeline. However, I still receive the raw text input in train(…, training_data, …), even if I add my requirements in requires=[]. How can I receive the output of other components?
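For anyone hitting the same issue, here is a minimal sketch of the save-and-reload pattern described above. It does not import Rasa: SentimentComponentSketch is a hypothetical stand-in class, the toy word-to-label "model" and the pickle file layout are assumptions for illustration, and the persist()/load() signatures are simplified compared with what a real custom component receives.

```python
import os
import pickle
import tempfile
from collections import Counter


class SentimentComponentSketch:
    """Hypothetical stand-in for a custom NLU component (does not import Rasa)."""

    def __init__(self, model=None):
        # The model lives only in this process's memory until it is persisted.
        self.model = model

    def train(self, examples):
        # Toy "model": remember the label each word was last seen with.
        self.model = {}
        for text, label in examples:
            for word in text.lower().split():
                self.model[word] = label

    def persist(self, file_name, model_dir):
        # Write the trained model to disk so a later process can reload it.
        path = os.path.join(model_dir, file_name + ".pkl")
        with open(path, "wb") as f:
            pickle.dump(self.model, f)
        return {"file": file_name + ".pkl"}

    @classmethod
    def load(cls, meta, model_dir):
        # Rebuild the component from what persist() stored.
        with open(os.path.join(model_dir, meta["file"]), "rb") as f:
            return cls(model=pickle.load(f))

    def process(self, text):
        # Without a loaded model there is nothing to predict with.
        if self.model is None:
            return None
        votes = [self.model[w] for w in text.lower().split() if w in self.model]
        return Counter(votes).most_common(1)[0][0] if votes else None


# "Training run": train and persist.
model_dir = tempfile.mkdtemp()
trained = SentimentComponentSketch()
trained.train([("I love this", "pos"), ("I hate this", "neg")])
meta = trained.persist("sentiment", model_dir)

# "Inference run": a fresh instance only knows what load() restores.
fresh = SentimentComponentSketch.load(meta, model_dir)
result = fresh.process("love")
untrained_result = SentimentComponentSketch().process("love")
```

The point of the sketch is that a fresh instance has no model unless load() restores one from disk, which would be consistent with the behaviour described above when training and inference run as separate processes.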

You need to specify the components in the pipeline (Choosing a Pipeline) in your config.yml.

The training_data object you receive in train() contains a list of messages. Each message holds a dictionary, and each featurizer puts its output into that dictionary under a specific key. To obtain the output of a specific featurizer, call message.get(<key>).

The CountVectorsFeaturizer, for example, provides text_features, response_features and intent_features (see rasa/count_vectors_featurizer.py and rasa/constants.py on the master branch of RasaHQ/rasa on GitHub).
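To make the message.get(<key>) idea concrete, here is a rough sketch that does not import Rasa. The Message class is a simplified stand-in, and toy_featurizer and collect_features are hypothetical helpers; only the text_features key is taken from the post above. In a real pipeline the featurizer would be a component listed before yours in config.yml.

```python
class Message:
    """Simplified stand-in for a Rasa message: text plus a dict of properties."""

    def __init__(self, text):
        self.text = text
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key, default=None):
        return self.data.get(key, default)


def toy_featurizer(messages):
    # Plays the role of an upstream featurizer: bag-of-words counts
    # over a shared, sorted vocabulary, stored under "text_features".
    vocab = sorted({w for m in messages for w in m.text.lower().split()})
    for m in messages:
        words = m.text.lower().split()
        m.set("text_features", [words.count(v) for v in vocab])
    return vocab


def collect_features(messages):
    # What a downstream component's train() could do: read the features
    # an earlier component stored instead of re-processing the raw text.
    return [
        m.get("text_features")
        for m in messages
        if m.get("text_features") is not None
    ]


messages = [Message("hello there"), Message("hello hello")]
vocab = toy_featurizer(messages)
features = collect_features(messages)
```

If the featurizer has not run before your component, message.get("text_features") returns None in this sketch, which mirrors the situation of only seeing raw text.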

The last problem was solved with your help. Thanks again. Now I am using the Rasa knowledge base. However, the bot keeps asking me to rephrase whenever there is a knowledge base query. I am sure that the query was posted to the action server correctly, which means the NLU model works. Where could the problem be? I made a slight change to the knowledgebot in the examples. Here is the code:

(Attachments not preserved: domain, nlu.md, knowledgebase.json, actions.py, client program.)

Can you run your bot in debug mode (--debug) and paste the log from Rasa and the action server here?

Here it is.

It seems that your sentiment extractor overwrites the entities extracted by the CRF entity extractor. Can you please check that? The knowledge base action depends on the entities. If no car_insurance entity is found, it cannot query the knowledge base. The line Received message ... indicates that there is no `car_insurance` entity.
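As an illustration of what overwriting versus keeping the entities could look like, here is a small sketch that does not import Rasa. The Message class is a simplified stand-in, both helper functions are hypothetical, and the entity values are made up for the example.

```python
class Message:
    """Simplified stand-in for a Rasa message: text plus a dict of properties."""

    def __init__(self, text):
        self.text = text
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key, default=None):
        return self.data.get(key, default)


def add_sentiment_overwriting(message, sentiment_entity):
    # Replaces the whole list: entities from earlier extractors are lost.
    message.set("entities", [sentiment_entity])


def add_sentiment_appending(message, sentiment_entity):
    # Keeps what earlier extractors found and adds the sentiment on top.
    existing = message.get("entities", [])
    message.set("entities", existing + [sentiment_entity])


# Hypothetical values, for illustration only.
crf_entity = {"entity": "car_insurance", "value": "example plan"}
sentiment = {"entity": "sentiment", "value": "pos"}

overwritten = Message("tell me about the example plan")
overwritten.set("entities", [crf_entity])
add_sentiment_overwriting(overwritten, sentiment)

appended = Message("tell me about the example plan")
appended.set("entities", [crf_entity])
add_sentiment_appending(appended, sentiment)

overwritten_names = [e["entity"] for e in overwritten.get("entities")]
appended_names = [e["entity"] for e in appended.get("entities")]
```

In the overwriting variant the car_insurance entity is gone by the time the action runs, which would match the symptom in the log; the appending variant keeps both.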