Hello, I am new to Rasa. I want to create a chatbot that supports multiple languages. We are starting with English and Spanish, and later we will need to support French and Portuguese. We want to use Rasa with Rasa X, so any multi-language solution has to take that into account.
I have been reading the community discussions, and so far the suggested solution seems to be a separate NLU layer for each language.
The links provided in other discussions mainly pointed to chatfuel-rasa, the N26 talk, and the weather bot.
chatfuel-rasa was a good starting point, but will it integrate with Rasa X? It would also result in a lot of duplicated stories and intents. Can that rework be avoided?
The N26 presenter seems to solve the problem elegantly (especially the content management part), but I could not find any sample source code to make sense of it.
The weather-bot discussion seemed promising, but as far as I can see the language support code never made it into the main branch. I also could not find any example of MetaRasaNLUInterpreter or of how it could be integrated. @alexissmirnov
A sample application using two or more languages would be a very helpful starting point. I also need direction on the tweaks required to make it work with Rasa X.
FYI: We need to do additional processing on all our channel input and output, so I am using the callback channel for all communication, with a self-generated sender_id that is the same across channels.
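To illustrate what a sender_id that stays the same across channels could look like, here is a minimal Python sketch. It assumes the application has some stable per-user key (an email address or account id, for example); the post does not say how the id is actually generated, so `make_sender_id` and `SENDER_NAMESPACE` are hypothetical names.

```python
import uuid

# Hypothetical namespace: any fixed UUID works, as long as it never changes.
SENDER_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def make_sender_id(user_key: str) -> str:
    """Derive a deterministic sender_id from a stable user key.

    The same user key always yields the same id, regardless of which
    channel the message arrived on.
    """
    normalized = user_key.strip().lower()
    return str(uuid.uuid5(SENDER_NAMESPACE, normalized))
```

Because the id is derived rather than stored, every channel adapter can compute it independently without a shared lookup table.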
I am unfamiliar with Chatfuel, but one thing that may be relevant is that the chatbot needs stories and responses for each language. There may be a cultural aspect to this as well. A chatbot in the UK may need to handle a conversation differently than a chatbot in the US, despite sharing the same language (think of the differences in zip code or phone number formats alone). I can imagine this cultural aspect contributes to the choice of separating different languages into different instances of a digital assistant.
I do not know if this is relevant to your use case, but I figured it was worth mentioning as a consideration.
The Chatfuel example above only uses it to distinguish between languages and redirect users to the appropriate URL, one for each language.
I totally agree with you regarding the cultural aspect, and the eventual solution might be a different agent for each language. I just wanted to know the best way to do it so that we can minimize code duplication and content management effort.
It would be helpful to get sample code for two bots that also integrate with Rasa X.
I ended up creating a separate repository for my bot and using a Docker image of it in the main project. This way I can integrate the repository with Rasa X. By using one repository per language and pulling the Docker image for each repository into the main project, we can support multiple languages.
We use the REST and callback channels for all communication. Every message comes to our application first, and based on the language we call the appropriate Docker image.
For multi-language support, we ended up with a Rasa X deployment for one language on one instance using docker-compose, another instance running Rasa-only bots for a different language, and a language-based request forwarding API.
We first detect the language and then forward the request to the appropriate Rasa instance, either the one behind Rasa X or the Rasa-only bot.
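A rough Python sketch of this detect-then-forward step is below. It is not our actual code: the instance URLs are made up, and `detect_language` is a toy stopword heuristic standing in for whatever real language detector you use. The only Rasa-specific part is the REST channel endpoint `/webhooks/rest/webhook`, which takes a JSON body with `sender` and `message`.

```python
import json
import urllib.request

# Hypothetical base URLs, one Rasa server per language.
RASA_INSTANCES = {
    "en": "http://rasa-en:5005",
    "es": "http://rasa-es:5005",
}
DEFAULT_LANGUAGE = "es"  # assumption: fall back to the primary language

# Toy stopword lists; a placeholder for a real language detector.
STOPWORDS = {
    "en": {"the", "is", "and", "what", "how", "you", "my", "please"},
    "es": {"el", "la", "es", "y", "que", "como", "mi", "por", "hola"},
}


def detect_language(text):
    """Pick the language whose stopwords appear most often in the text."""
    tokens = [t.strip(".,!?¿¡") for t in text.lower().split()]
    scores = {
        lang: sum(1 for t in tokens if t in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else DEFAULT_LANGUAGE


def build_request(text, sender_id):
    """Return (language, url, payload) for the matching Rasa instance."""
    lang = detect_language(text)
    url = RASA_INSTANCES[lang] + "/webhooks/rest/webhook"
    payload = {"sender": sender_id, "message": text}
    return lang, url, payload


def forward(text, sender_id):
    """POST the message to the selected instance and return the bot replies."""
    _, url, payload = build_request(text, sender_id)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping `build_request` separate from `forward` makes the routing decision easy to unit test without a running Rasa server. If you use the callback channel instead, the path and response handling would differ.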
The downside is that we now have two action server instances, because the Rasa X action image is not exposed externally unless the port is opened (ref).
The other downside is that we are only using Rasa X for a single language. To support the other language, will we have to use language-specific Rasa X instances?
I am sure we can switch languages by choosing a different language branch in git, but then the language detection logic has to point to the correct language instance, so I still need to figure that out. This also raises the question of user conversation history. If only we could tag conversations with their language, we could easily filter them based on the repository selected.
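One idea for the tagging part, as an untested sketch: set a slot on each conversation through the Rasa HTTP API (`POST /conversations/{conversation_id}/tracker/events` with a `slot` event). This assumes the HTTP API is enabled on the server (for example with `--enable-api`), that any required authentication is handled separately, and that a slot named `language` is defined in the domain; the slot name and the helper names are hypothetical. I have not checked whether Rasa X can filter conversations by slot value.

```python
import json
import urllib.parse
import urllib.request


def build_language_tag(base_url, sender_id, language):
    """Return (url, event) for tagging one conversation with its language."""
    conversation = urllib.parse.quote(sender_id, safe="")
    url = "{}/conversations/{}/tracker/events".format(
        base_url.rstrip("/"), conversation
    )
    # "language" is a hypothetical slot that must exist in the bot's domain.
    event = {"event": "slot", "name": "language", "value": language}
    return url, event


def tag_conversation(base_url, sender_id, language):
    """Append the slot event to the conversation tracker."""
    url, event = build_language_tag(base_url, sender_id, language)
    req = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The forwarding API already knows the detected language, so it could call `tag_conversation` once per new sender_id before passing the first message on.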
I will be exploring all of the above in the coming weeks. If anyone has a better idea of how to approach this, please let me know. It would also be great if there were a better way to manage replicas of the bot in different languages. Currently we have no option other than a translation API plus human review.
Yes, with the approach I mentioned in the previous comment.
Basically we went with two repositories and one Rasa X instance. We primarily use the Rasa X instance to work with the Spanish-language bot, and all development is done in this branch. When we update the bot, we merge the Spanish branch into the English one, translating with Google Translate (currently done manually). Our primary users are Spanish speaking, so this has been working for now, but it is not elegant and is error-prone.