RASA multilingual chatbot - only NLU or complete chatbot?

gabriel-bercaru · June 20, 2022, 1:02pm

Hello,

When developing a chatbot meant to converse in a language other than English, is it advisable to implement the NLU component (user intent identification) in English and write the chatbot replies (the utterances defined in domain.yml) in the target language?

My reasoning is that in English, one can easily use pretrained embeddings from transformer language models such as BERT or GPT to classify intents. In the target language, however, especially if it is a low-resource language, the corresponding transformer models might yield lower intent classification accuracy.

Ultimately the approach would be to have the input text in the low resource language, translate it to English, have the NLU component predict the intent (given the English text) and then reply accordingly. However, do the replies also have to be in English (and then translated back to the source language)? Is there any issue with coding the chatbot replies directly in the source language?

Thank you!

stephens · June 29, 2022, 4:26am

I recommend you create a single repo and produce a separate model for each language. Try sharing the rules and stories across languages while keeping the NLU data (intents) in a separate directory per language. This would require separate rasa train commands that pull the language-specific NLU training data but use the common rules/stories.
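
As a rough sketch of what those separate train commands could look like: the directory names data/common and data/<language>/nlu and the model names below are assumptions for illustration, not something the post specifies. The sketch relies on rasa train accepting several paths after --data, and leaves the domain and config at their defaults.

```python
from pathlib import Path
from typing import List


def train_command(language: str, root: Path = Path("data")) -> List[str]:
    """Build the argv for one language-specific `rasa train` run."""
    return [
        "rasa", "train",
        # Shared rules/stories first, then this language's NLU examples.
        # Both directory names are hypothetical.
        "--data", str(root / "common"), str(root / language / "nlu"),
        # Give each model a predictable, language-specific file name.
        "--fixed-model-name", f"model-{language}",
    ]


if __name__ == "__main__":
    for lang in ("french", "spanish"):  # hypothetical language directories
        print(" ".join(train_command(lang)))
```

Each printed line can be run as-is from the project root, or the list can be handed to subprocess.run in a CI job.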

Jawahar96 · November 15, 2023, 9:51am

Could you kindly explain how to share the rules and stories with the NLU data? Currently I am having to duplicate the stories and rules inside each language's subdirectory.

stephens · November 15, 2023, 5:51pm

Since that original post, I would first try a single model for a multilingual bot but separate the data directory by language.

"Currently I am having to duplicate the stories and rules inside each subdirectory for the languages."

You could avoid this. For example, if you have French and Spanish, set up a directory structure like this:

/data
  /common
    /domain
    /nlu
    /rules
    /stories
  /french
    /domain
    /nlu
  /spanish
    /domain
    /nlu

Use a build script in your Makefile and CI/CD pipeline that creates temporary build directories, one for Spanish and another for French. Both builds use the /common contents and then pull from the language-specific directory.

Run the rasa validate/test/train commands against the two build directories to create the two models.
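
A minimal sketch of such a build step in Python, assuming the layout above sits under a data directory and that the builds go into a separate build directory (the function name and the build location are hypothetical). It needs Python 3.8+ for dirs_exist_ok.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def assemble_build_dirs(
    data_root: Path, build_root: Path, languages: Iterable[str]
) -> List[Path]:
    """Create one build directory per language from /common plus that language."""
    built = []
    for language in languages:
        target = build_root / language
        if target.exists():
            shutil.rmtree(target)  # always start from a clean build
        # The shared domain, nlu, rules and stories go in first.
        shutil.copytree(data_root / "common", target)
        # Then merge the language-specific files on top. A file with the same
        # name as a shared file replaces the shared copy.
        for section in ("domain", "nlu"):
            source = data_root / language / section
            if source.is_dir():
                shutil.copytree(source, target / section, dirs_exist_ok=True)
        built.append(target)
    return built
```

Calling assemble_build_dirs(Path("data"), Path("build"), ["french", "spanish"]) would leave build/french and build/spanish ready for the validate, test and train runs.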

Jawahar96 · November 16, 2023, 8:29am

Ok got it. Let me try this out then.

While I have you here, could you help me figure out another thing I am struggling with? I want to set the chatbot's language to the language the user is currently viewing the website in. I am trying to send it via the initPayload from the webchat and read it in action_session_start.

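from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.events import ActionExecuted, EventType, SessionStarted, SlotSet
from rasa_sdk.executor import CollectingDispatcher


class ActionSessionStart(Action):
    def name(self) -> Text:
        return "action_session_start"

    @staticmethod
    def fetch_slots(tracker: Tracker) -> List[EventType]:
        # Collect the slots that should be carried over into the new session.
        slots = []
        # The trailing comma matters: ("language") is a plain string, and
        # iterating over it would yield single characters instead of the key.
        for key in ("language",):
            value = tracker.get_slot(key)
            if value is not None:
                slots.append(SlotSet(key=key, value=value))
        return slots

    async def run(
        self, dispatcher: CollectingDispatcher, tracker: Tracker, domain: Dict[Text, Any]
    ) -> List[Dict[Text, Any]]:
        # Start the session, re-apply the carried-over slots, then listen.
        events = [SessionStarted()]
        events.extend(self.fetch_slots(tracker))
        events.append(ActionExecuted("action_listen"))

        return events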

And the initPayload is '/greet{"language":"de"}'. Is this a feasible solution, or is there a better way to handle this?
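
To make explicit what that payload carries, here is a small illustrative sketch that splits it into an intent name and an entity dictionary. This is not Rasa's own parser, and split_payload is a hypothetical helper. It assumes the payload uses straight double quotes, since the JSON part will not parse with typographic quotes.

```python
import json
from typing import Any, Dict, Tuple


def split_payload(payload: str) -> Tuple[str, Dict[str, Any]]:
    """Split '/intent{json}' into the intent name and its entity dictionary."""
    if not payload.startswith("/"):
        raise ValueError("expected a payload starting with '/'")
    body = payload[1:]
    brace = body.find("{")
    if brace == -1:
        return body, {}  # intent only, no entities attached
    return body[:brace], json.loads(body[brace:])


intent, entities = split_payload('/greet{"language":"de"}')
print(intent, entities)  # prints: greet {'language': 'de'}
```

Whether the language value then ends up in a slot depends on how the domain declares the entity and the slot, which is not shown here.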
