How can I make a multi-language Rasa chatbot that supports at least two languages?

I want to build a Rasa-based chatbot that supports at least two languages (a multilingual chatbot). Can anybody suggest a possible way of doing it?

2 Likes

Well, I know two possible ways of doing it: 1.- You train X models, one for each language you want, then you load X agents and use a language detector to choose which agent you should send the message to. Instead of a language detector, you can also send the message to the agent that gives the highest confidence for the user's input. This would look like this:

async def get_response(user_input):
    # `interpreters` and `agents` are dicts keyed by language code ('es', 'en'), loaded elsewhere
    # Parse the input once with each interpreter and compare the intent confidences
    # Sometimes the detected language does not match the input: "cual es tu color favorito" is detected as English
    parsed = {lang: await interpreters[lang].parse(user_input) for lang in ('es', 'en')}

    confidence_es = parsed['es']['intent']['confidence']
    confidence_en = parsed['en']['intent']['confidence']
    lang = 'es' if confidence_es > confidence_en else 'en'

    # Info interpreted, with intent detected & intent_ranking (reuses the parse result above)
    interpreted_info = parsed[lang]
    # Getting intent information
    intent = interpreted_info['intent']
    intent_name = intent['name']

    # Bot answer
    answer = await agents[lang].handle_text(user_input)
...

2.- You train only an English model (English is the language it generally works best with) and you work like this:

detect language of the user msg -> translate to English -> handle message with Rasa -> translate the answer back
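As a rough illustration of that pipeline, here is a minimal sketch. Everything in it is an assumption for the sake of the example: `make_translating_handler`, the `detect` and `translate` callables, and the toy stand-ins are hypothetical, and `handle_english` stands in for whatever call sends text to the English Rasa agent (which in practice is asynchronous and returns richer data than a string).

```python
from typing import Callable


def make_translating_handler(
    detect: Callable[[str], str],
    translate: Callable[[str, str, str], str],
    handle_english: Callable[[str], str],
) -> Callable[[str], str]:
    """Wrap an English-only bot so it can answer in the user's language."""

    def handle(user_input: str) -> str:
        lang = detect(user_input)
        if lang == "en":
            # Already English: no translation needed in either direction.
            return handle_english(user_input)
        english_input = translate(user_input, lang, "en")
        english_answer = handle_english(english_input)
        return translate(english_answer, "en", lang)

    return handle


# Toy stand-ins so the sketch runs on its own; swap in real services.
PHRASES = {
    ("es", "en"): {"hola": "hello"},
    ("en", "es"): {"hi there": "hola"},
}


def toy_detect(text: str) -> str:
    return "es" if text.lower() in PHRASES[("es", "en")] else "en"


def toy_translate(text: str, source: str, target: str) -> str:
    # Unknown phrases pass through unchanged.
    return PHRASES[(source, target)].get(text.lower(), text)


def toy_bot(text: str) -> str:
    return "hi there" if text.lower() == "hello" else "sorry, I did not understand"


handle = make_translating_handler(toy_detect, toy_translate, toy_bot)
print(handle("hola"))
print(handle("hello"))
```

Passing the detector and translator in as plain callables keeps the bot code independent of any particular translation service.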

The second option is less flexible because you have the same model for every language, but in return you only run one agent instance. The first one lets you have a different model for each language, so you can have different kinds of conversations and dialogue. Maybe you only want some general information in every language, but specialized stories in particular ones.

The first solution can also use a language detector, as I said, but some words are very similar across languages, so the detector may give you the wrong output. In my case, for example, I had a problem when asking for the favorite color: the Spanish and English phrasings are quite similar and it always detected ‘en’, which is why I ended up using agent confidence. Make sure your training data is varied enough that the agents don’t get confused about the confidences; use words specific to each language to make a clear distinction.

I hope it helps. Good luck!
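The confidence-based choice also extends to more than two languages. The following is only a sketch under stated assumptions: `StubInterpreter` is a hypothetical stand-in that mimics the shape of a parse result (`intent.name`, `intent.confidence`), the confidence values are made up, and `pick_language` is an illustrative helper, not part of Rasa.

```python
import asyncio


class StubInterpreter:
    """Hypothetical stand-in for a per-language NLU interpreter."""

    def __init__(self, known):
        self.known = known  # maps text -> (intent name, confidence)

    async def parse(self, text):
        name, confidence = self.known.get(text, ("out_of_scope", 0.1))
        return {"text": text, "intent": {"name": name, "confidence": confidence}}


async def pick_language(interpreters, user_input):
    """Parse with every interpreter and keep the most confident result."""
    langs = list(interpreters)
    results = await asyncio.gather(
        *(interpreters[lang].parse(user_input) for lang in langs)
    )
    parsed = dict(zip(langs, results))
    best = max(parsed, key=lambda lang: parsed[lang]["intent"]["confidence"])
    return best, parsed[best]


# Made-up confidences: the Spanish model is far more sure about this input.
interpreters = {
    "en": StubInterpreter({"cual es tu color favorito": ("ask_color", 0.35)}),
    "es": StubInterpreter({"cual es tu color favorito": ("ask_color", 0.92)}),
}

lang, parsed = asyncio.run(pick_language(interpreters, "cual es tu color favorito"))
print(lang, parsed["intent"]["name"])
```

The returned language code can then be used to select the matching agent, as in the two-language version above.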

1 Like

Hi, I have created a simple Rasa bot that supports English. I wanted to know whether it is possible to create a bot that also supports another language, say, French. Is it possible to create a bot where the user chooses a language preference at the start and all the following conversation takes place in that language?

Thanks.

3 Likes

Hi @AlvaroMonteagudo

Can you please explain the second option in more detail?

Thanks

@akelad please answer this query. It’s still a big issue right now, after so many days.

@bamwani if the second language is English, then try using the config file for the other language alone; that is, don’t add the English configuration at all, and see if that works.

What I did to solve that was to create a language slot. When the user says they want to change the language, the bot gives them 4 buttons to choose from, one for each language.

Then every response is a custom action that checks the value of the language slot and returns the text accordingly.

These are the functions I wrote to look up the current language and pick the matching text:

lang_list = ['English', 'French', 'Arabic', 'Armenian'] # Same as slot values, language is a categorical slot

def get_lang(tracker):
    lang = tracker.slots['language'].title()
    return lang

def get_lang_index(tracker):
    return lang_list.index(get_lang(tracker))

# dispatcher.utter_message(text = ...)
def get_text_from_lang(tracker, utter_list = None): # None instead of a shared mutable default list
    lang_index = get_lang_index(tracker)

    if not utter_list: # No text was given for any language
        return '[NO TEXT DEFINED]'

    if lang_index >= len(utter_list): # No text was given for current language
        lang_index = 0 # Fall back to the first language

    return utter_list[lang_index]

# dispatcher.utter_message(template = ...)
def get_template_from_lang(tracker, template):
    return template + '_' + get_lang(tracker)

# dispatcher.utter_message(buttons = ...)
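def get_buttons_from_lang(tracker, titles = None, payloads = None):
    titles     = titles or []   # List of title lists, one per language
    payloads   = payloads or [] # Payloads shared by every language
    lang_index = get_lang_index(tracker)
    buttons    = []

    if not titles: # No titles were given for any language
        return buttons

    if lang_index >= len(titles): # No titles were given for current language
        lang_index = 0 # Fall back to the first language

    for title, payload in zip(titles[lang_index], payloads): # Build each button
        buttons.append({'title': title, 'payload': payload})

    return buttons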

Here are some example uses:

When the user wants to change the language, the bot picks its prompt with get_text_from_lang and offers 4 buttons. With this function, the responses are hardcoded in the action code.

text = get_text_from_lang(
    tracker,
    ['Choose a language:',
    'Choisissez une langue:',
    ':اختر لغة',
    'Ընտրեք լեզու ՝'])

buttons = [
    {'title': 'English',  'payload': '/set_language{"language": "English"}'},
    {'title': 'Français', 'payload': '/set_language{"language": "French"}'},
    {'title': 'عربي',     'payload': '/set_language{"language": "Arabic"}'},
    {'title': 'հայերեն',  'payload': '/set_language{"language": "Armenian"}'}
]
       
dispatcher.utter_message(text = text, buttons = buttons)

When the bot should ask about a service type, it will choose a template/response defined in the domain using get_template_from_lang and will choose which titles to display according to the language using get_buttons_from_lang.

When using get_template_from_lang, the domain should define one response per language, named after the template plus the language suffix: utter_ask_service_type_English, utter_ask_service_type_French, etc. Each can contain multiple variations, and one will be taken at random like any normal utterance.

When using get_buttons_from_lang, you provide a list of lists of titles in each language, followed by a list of payloads (each language has the same payload).

template = get_template_from_lang(tracker, 'utter_ask_service_type')
buttons  = get_buttons_from_lang(
    tracker,
    [['Wireless', 'Internet', 'DSL Internet', 'CableVision TV'],
    ['Sans Fil', 'Internet', 'Internet DSL', 'CableVision TV'],
    ['لاسلكي','إنترنت','DSL إنترنت','تلفزيون الكابل'],
    ['Անլար', 'Ինտերնետ', 'DSL ինտերնետ', 'CableVision TV']],
    [
        '/inform_service_type{"service_type": "wireless"}',
        '/inform_service_type{"service_type": "internet"}',
        '/inform_service_type{"service_type": "dsl"}',
        '/inform_service_type{"service_type": "cablevision"}'
    ])

dispatcher.utter_message(template = template, buttons = buttons)
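To try the helpers outside a running bot, a small stub can replace the tracker. This is only a sketch: `FakeTracker` is a hypothetical class exposing just the `slots` dict the helpers read, and the helpers are condensed copies of the ones above, repeated so the snippet runs on its own.

```python
lang_list = ['English', 'French', 'Arabic', 'Armenian']


class FakeTracker:
    """Hypothetical stand-in exposing only the `slots` dict the helpers read."""

    def __init__(self, language):
        self.slots = {'language': language}


def get_lang(tracker):
    return tracker.slots['language'].title()


def get_template_from_lang(tracker, template):
    return template + '_' + get_lang(tracker)


def get_text_from_lang(tracker, utter_list=None):
    if not utter_list:  # No text was given for any language
        return '[NO TEXT DEFINED]'
    lang_index = lang_list.index(get_lang(tracker))
    if lang_index >= len(utter_list):  # No text for the current language
        lang_index = 0  # Fall back to the first language
    return utter_list[lang_index]


french = FakeTracker('french')
armenian = FakeTracker('Armenian')

# Name of the domain response that would be looked up for a French user.
template_name = get_template_from_lang(french, 'utter_ask_service_type')
# Text picked by language index, and the fallback when a language is missing.
greeting = get_text_from_lang(french, ['Hello', 'Bonjour'])
fallback = get_text_from_lang(armenian, ['Hello', 'Bonjour'])
print(template_name, greeting, fallback)
```

The Armenian case shows the fallback: only two texts are supplied, so the first one (English) is used.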