We want to build a smart speaker that can chat with Chinese students in English, and we have the following questions:
Can all the models (ASR, NLU, NLP, TTS…) be installed on one server, without needing to connect to other servers (such as Google, RASA, Amazon…)? Many servers are blocked by the Chinese government.
Is RASA free?
We want the speech speed to be adjustable on request, such as “pardon”, “speak slowly”… How can this be realized in RASA?
Can it support both Chinese and English at the same time?
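Yes, there are several open source models you can chain together: ASR (Mozilla DeepSpeech) → Rasa (NLU, dialogue), where you will train a few models yourself and which is open source → TTS (text to speech), for example Mozilla TTS. Note that DeepSpeech itself only covers the speech-to-text side.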
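To make the ASR → Rasa → TTS chain a bit more concrete, here is a minimal sketch of one conversation turn. It assumes a Rasa server running on the same machine with the REST channel enabled at `http://localhost:5005/webhooks/rest/webhook`. `transcribe` and `synthesize` are hypothetical callables standing in for whichever local ASR and TTS engines you choose, and the helper names are mine, not part of Rasa.

```python
import json
import urllib.request
from typing import Callable, List

# Assumption: Rasa runs locally with the REST channel enabled.
RASA_URL = "http://localhost:5005/webhooks/rest/webhook"


def build_payload(sender_id: str, text: str) -> bytes:
    """Encode one transcribed utterance as the JSON body for the REST channel."""
    return json.dumps({"sender": sender_id, "message": text}).encode("utf-8")


def extract_replies(response_body: bytes) -> List[str]:
    """Collect the text of every bot message in a REST channel response."""
    messages = json.loads(response_body.decode("utf-8"))
    return [m["text"] for m in messages if "text" in m]


def ask_rasa(sender_id: str, text: str, url: str = RASA_URL) -> List[str]:
    """Send the transcript to Rasa and return the bot's replies as plain text."""
    request = urllib.request.Request(
        url,
        data=build_payload(sender_id, text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_replies(response.read())


def handle_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # hypothetical local ASR wrapper
    synthesize: Callable[[str], bytes],   # hypothetical local TTS wrapper
    sender_id: str = "speaker-1",
) -> List[bytes]:
    """One turn: recorded audio in, one synthesized audio clip per bot reply out."""
    transcript = transcribe(audio)
    return [synthesize(reply) for reply in ask_rasa(sender_id, transcript)]
```

Nothing here leaves the machine as long as the ASR, Rasa and TTS processes all run locally, which is the point of the single-server setup.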
I think for ASR there is also Nvidia Riva, but I haven’t tried it myself, so I can’t say whether it is open source.
Also, since you are trying to build an English bot, you should be fine with open source models.
Yes, Rasa is open source, while Rasa X is a community version, and there is an enterprise offering as well.
This mostly comes down to how well you manage to transcribe English spoken by non-native English speakers and provide it as input to Rasa. I think your challenge, or in fact the challenge for any ASR, is understanding English from non-native speakers. My Alexa at home often misunderstands my English as well, so this is quite a challenge for ASR in general.
If you manage to transcribe it properly, you can have an intent in Rasa called adjust_speech_speed and an action that tells the TTS service to change the speed of the reply. Rasa can orchestrate this, but the hard parts are getting the ASR to transcribe the speech correctly and getting your TTS to adjust accordingly.
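As a rough illustration of the action side, the sketch below keeps a current speech rate and wraps each reply in an SSML `<prosody rate="...">` tag. It assumes your TTS engine accepts SSML, which you would need to check. The `direction` value ("slower" or "faster") is a hypothetical entity you would attach to the adjust_speech_speed intent. In a real bot this logic would probably live in a custom action that stores the rate in a slot.

```python
from xml.sax.saxutils import escape

# Standard SSML prosody rate keywords, slowest first.
RATES = ["x-slow", "slow", "medium", "fast", "x-fast"]

# Hypothetical entity values extracted alongside the adjust_speech_speed intent.
STEPS = {"slower": -1, "faster": 1}


def next_rate(current: str, direction: str) -> str:
    """Move one step slower or faster, staying within the known rates."""
    index = RATES.index(current) + STEPS.get(direction, 0)
    index = max(0, min(index, len(RATES) - 1))
    return RATES[index]


def to_ssml(text: str, rate: str) -> str:
    """Wrap a reply in SSML so the TTS engine speaks it at the given rate."""
    return f'<speak><prosody rate="{rate}">{escape(text)}</prosody></speak>'


# Example: the user says "speak slowly" while the bot is at normal speed.
rate = next_rate("medium", "slower")
print(to_ssml("How was school today?", rate))
```

For “pardon”, one option is to treat it as "slower" and resend the previous reply at the new rate.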
Rasa supports Chinese with a different tokenizer such as Jieba, while English uses the whitespace tokenizer in the standard configuration. I know there is a large group of Rasa enthusiasts for Chinese (@howlanderson can help you, I guess). I can’t say much about ASR and TTS for Chinese.
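To show why the tokenizer matters, here is a small sketch: Chinese text has no spaces between words, so splitting on whitespace returns the whole sentence as one token. The helper that picks a tokenizer per utterance is only an illustration of routing between an English and a Chinese pipeline. `JiebaTokenizer` and `WhitespaceTokenizer` are the Rasa component names, but the routing function itself is my assumption, not something Rasa provides.

```python
def contains_cjk(text: str) -> bool:
    """True if any character falls in the CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def pick_tokenizer(text: str) -> str:
    """Hypothetical router: choose a Rasa tokenizer name from the script used."""
    return "JiebaTokenizer" if contains_cjk(text) else "WhitespaceTokenizer"


# Whitespace splitting works for English but not for Chinese.
print("please speak slowly".split())  # ['please', 'speak', 'slowly']
print("请说慢一点".split())             # ['请说慢一点']
print(pick_tokenizer("请说慢一点"))     # JiebaTokenizer
```

A sentence that mixes both languages would be routed to the Chinese pipeline by this check, so real mixed-language input needs more thought.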