I haven't managed to find any relevant information on this subject: what is the best practice for including a text-to-speech engine inside Rasa Core? I have my own TTS engine; I just want to switch from textual feedback to a synthesized one.
Should I implement this feature inside my action.py?
Should I write my own dispatcher (via OutputChannel)?
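The output-channel route is usually the cleaner of the two, since it keeps synthesis out of your business logic. Below is a minimal sketch of that pattern in plain Python: the class names and the `synthesize` method are stand-ins, not the actual Rasa Core API, so treat this as an illustration of wrapping a text channel with a TTS step rather than drop-in code.

```python
# Sketch: route bot responses through a TTS engine instead of plain text.
# `TextOutputChannel`, `TTSOutputChannel`, and `synthesize` are hypothetical
# names standing in for a real Rasa OutputChannel subclass and TTS API.

class TextOutputChannel:
    """Plain text channel (stand-in for Rasa's OutputChannel)."""

    def send_text_message(self, recipient_id: str, text: str) -> None:
        print(f"[text -> {recipient_id}] {text}")


class TTSOutputChannel(TextOutputChannel):
    """Same interface, but synthesizes audio from the outgoing text."""

    def __init__(self, tts_engine) -> None:
        # Assumed engine interface: .synthesize(text) -> audio bytes
        self.tts = tts_engine

    def send_text_message(self, recipient_id: str, text: str) -> None:
        audio = self.tts.synthesize(text)
        print(f"[audio -> {recipient_id}] {len(audio)} bytes")


class DummyTTS:
    """Placeholder engine so the sketch runs without a real synthesizer."""

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


channel = TTSOutputChannel(DummyTTS())
channel.send_text_message("user1", "Hello!")
```

Because both channels expose the same `send_text_message` signature, switching between textual and spoken feedback becomes a matter of which channel instance you hand to the dispatcher.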
For ASR (speech-to-text): last summer, my job was to build an embedded speech recognizer that could run on a Jetson TX2 in real time. My work was inspired by a mix of the DeepSpeech (Mozilla) and Wav2Letter (Facebook) architectures. If you are new to the subject, I suggest using Keras, as it has a "friendly" programming interface. There are several implementations of such architectures if you google it!
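For context, both DeepSpeech and Wav2Letter are trained with the CTC loss, and the simplest way to turn their per-frame outputs into text is a greedy CTC decode: take the most likely label per frame, collapse consecutive repeats, then drop the blank symbol. A small pure-Python sketch of that decode step (the blank token choice here is an assumption; it is just an index in a real model's vocabulary):

```python
# Greedy CTC decoding, as used at inference time by CTC-trained models
# like DeepSpeech and Wav2Letter. The blank symbol "_" is an assumption
# for illustration; real models use a reserved vocabulary index.

BLANK = "_"

def ctc_greedy_decode(frame_labels: str) -> str:
    """Collapse repeated frame labels, then remove blank symbols."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != BLANK:
            out.append(lab)
        prev = lab
    return "".join(out)

# Example: per-frame argmax output "hh_e_ll_llo" decodes to "hello"
# (repeats collapse, blanks keep the double "l" from merging).
print(ctc_greedy_decode("hh_e_ll_llo"))
```

Note how the blank token between the two "l" runs is what lets the decoder keep a genuine double letter instead of collapsing it away.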
For TTS (text-to-speech): I'm currently using a solution from a sister company called Acapela. If I had to build my own, the best implementation is "Tacotron 2" from Google. I think there are some implementations on GitHub.
Thank you Asokolow. I'm pretty new to ASR & TTS technologies and will look into Keras & Tacotron 2. My requirement is to build a chatbot with speech capabilities that would need to be integrated into an Angular 6 web application.
If your target application is web-based, I suggest having a look at TensorFlow.js instead of Keras! TensorFlow is a bit more complex than Keras, but it is worth every penny (a.k.a. time)!