Text-to-Speech

Hey!

I haven't managed to find any relevant information on this subject: what is the best practice for including a text-to-speech engine in Rasa Core? I have my own TTS engine, I just want to switch from textual feedback to synthesized feedback.

  • Should I implement this feature inside my action.py?

  • Should I write my own dispatcher (via OutputChannel)?
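
To make the second option concrete, a rough sketch of the wrapper idea could look like the following. It deliberately does not import Rasa: `CollectingTextChannel`, `SpeechOutputChannel` and `fake_tts` are made-up stand-ins, and the `send_text_message(recipient_id, message)` method name is only assumed to mirror what an output channel exposes, so check it against the version you run.

```python
from typing import Callable, List, Tuple


class CollectingTextChannel:
    """Hypothetical stand-in for a text output channel (not Rasa's class)."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, str]] = []

    def send_text_message(self, recipient_id: str, message: str) -> None:
        # A real channel would deliver the text to the user here.
        self.messages.append((recipient_id, message))


class SpeechOutputChannel:
    """Wraps a text channel and also synthesizes every outgoing message."""

    def __init__(
        self,
        text_channel: CollectingTextChannel,
        synthesize: Callable[[str], bytes],
        speech_enabled: bool = True,
    ) -> None:
        self.text_channel = text_channel
        self.synthesize = synthesize          # your own TTS engine goes here
        self.speech_enabled = speech_enabled  # switch between text and speech
        self.audio: List[Tuple[str, bytes]] = []

    def send_text_message(self, recipient_id: str, message: str) -> None:
        # Always forward the text so the textual feedback keeps working.
        self.text_channel.send_text_message(recipient_id, message)
        if self.speech_enabled:
            # Store the audio; a real channel would stream or play it.
            self.audio.append((recipient_id, self.synthesize(message)))


def fake_tts(text: str) -> bytes:
    """Placeholder engine: returns UTF-8 bytes instead of real audio."""
    return text.encode("utf-8")


if __name__ == "__main__":
    channel = SpeechOutputChannel(CollectingTextChannel(), fake_tts)
    channel.send_text_message("user-1", "Hello there")
    print(len(channel.audio), "audio clip(s) produced")
```

The point of wrapping at the channel level is that the actions stay unaware of speech: they keep sending plain text, and the switch between textual and synthesized feedback lives in one place.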

Thx in advance,

Alex

Running into the same issue. Please let me know if you find any answer. Thanks.

Sure, I’m currently reverse engineering the source code! I’ll let you know.

Asokolow,

Thanks for the quick reply. Which technology are you using for TTS? Are there any open source tools for text-to-speech and speech-to-text? Thanks.

There are several and they are getting better, but they don’t come close to Google quality yet.

hari,

  • For the ASR (speech-to-text): Last summer, my job was to build an embedded speech recognizer that could run on a Jetson TX2 in real time. My work was inspired by a mix of the DeepSpeech (Mozilla) and Wav2Letter (Facebook) architectures. If you are new to the subject, I suggest using Keras, as it has a “friendly” programming interface. There are several implementations of such architectures if you google it!
  • For the TTS (text-to-speech): I’m currently using a solution from a sister company called Acapela. If I had to build my own, I would start from “Tacotron 2” from Google, which I consider the best option. I think there are some implementations on GitHub.

Regards,

Alex

Thank you Asokolow. I’m pretty new to ASR and TTS technologies. I will look into Keras and Tacotron 2. My requirement is to build a chatbot with speech capabilities that needs to be integrated into an Angular 6 web application.

If your target application is web-based, I suggest having a look at TensorFlow.js instead of Keras! TensorFlow is a bit more complex than Keras, but it is worth every penny (a.k.a. time)!

Thanks Alex