Excited to share our latest project: the successful integration of Asterisk (VoIP), Rasa chat-bot, and Yandex SpeechKit!
Last week, we achieved a milestone by integrating Asterisk, our VoIP system, with Rasa and implementing Yandex SpeechKit for speech interaction.
Asterisk (VoIP) and Rasa Integration: We’ve successfully enabled seamless communication between Asterisk, our VoIP system, and Rasa, unlocking more efficient and intelligent conversation scenarios with our users. This integration empowers our chatbot to work seamlessly with our phone system, allowing for intricate and personalized communication pathways.
Yandex SpeechKit Integration: By leveraging Yandex SpeechKit, we’ve introduced speech recognition and synthesis capabilities. Users can now interact with our chatbot using their voice, elevating the user experience and making the interaction process more convenient and accessible.
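For readers wondering what the link between the phone side and Rasa can look like, here is a minimal sketch, not the project's actual code. It assumes the recognized text is sent to Rasa's REST channel (`/webhooks/rest/webhook`) with the call ID as the sender. The URL, the `ask_rasa` helper, and the use of a call ID as sender are assumptions for illustration.

```python
import json
from typing import List
from urllib import request

# Assumption: Rasa runs locally with the REST channel enabled.
RASA_URL = "http://localhost:5005/webhooks/rest/webhook"


def build_rasa_request(call_id: str, text: str, url: str = RASA_URL) -> request.Request:
    """Build the POST request carrying one recognized caller utterance."""
    body = json.dumps({"sender": call_id, "message": text}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_replies(raw: bytes) -> List[str]:
    """Pull the text replies out of the JSON list returned by the REST channel."""
    messages = json.loads(raw.decode("utf-8"))
    return [m["text"] for m in messages if "text" in m]


def ask_rasa(call_id: str, text: str) -> List[str]:
    """Send the utterance to Rasa and return the texts to pass on to speech synthesis."""
    with request.urlopen(build_rasa_request(call_id, text), timeout=5) as resp:
        return extract_replies(resp.read())
```

Keeping one sender ID per call lets Rasa track each phone conversation separately.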
Results:
Improved user interaction experience
Greater flexibility with personalized communication scenarios
Expanded accessibility for users with diverse needs
We take pride in the outcomes of our project and are eager to share our experience with the Rasa community! If you have questions, want more details, or are interested in discussing use cases, please feel free to ask in the comments.
Can you share how you implemented turn-taking? For example, did you use something like VAD to detect when the caller is speaking? What happens when both sides talk at the same time? Are any rules applied?
Level 1
The bot analyzes the incoming audio signal for the presence of voice. This is implemented with Asterisk’s own facilities. Each 20-millisecond frame of the signal is evaluated for what Asterisk calls energy. Voice is considered present when the energy exceeds a threshold, which is set as a constant in the configuration and was chosen based on earlier experiments.
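A minimal sketch of this kind of per-frame check, written in plain Python rather than with Asterisk itself. The sample rate, the 16-bit PCM format, the threshold value, and the use of mean absolute amplitude as the "energy" measure are all assumptions; the post does not say how Asterisk computes it.

```python
import struct

FRAME_MS = 20
SAMPLE_RATE = 8000  # assumption: narrowband telephony audio
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000  # 160 samples per 20 ms frame
ENERGY_THRESHOLD = 500.0  # assumption: a value that would be tuned experimentally


def frame_energy(frame: bytes) -> float:
    """Mean absolute amplitude of 16-bit little-endian signed PCM samples."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return sum(abs(s) for s in samples) / count


def is_voice(frame: bytes, threshold: float = ENERGY_THRESHOLD) -> bool:
    """A frame counts as voice when its energy exceeds the configured threshold."""
    return frame_energy(frame) > threshold
```

A fixed threshold is simple and predictable, but it has to be re-tuned if line noise or codec levels change.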
Level 2
The bot determines the beginning and end of a phrase. A phrase is considered to have ended once silence has lasted longer than a specified period.
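As an illustration only, phrase boundaries can be derived from the per-frame voice decisions with a small counter. The class name `PhraseDetector` and the 500 ms silence window are assumptions, not values from the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhraseDetector:
    # Assumption: 25 frames * 20 ms = 500 ms of silence ends a phrase.
    silence_frames_to_end: int = 25
    in_phrase: bool = False
    silent_run: int = 0

    def feed(self, voiced: bool) -> Optional[str]:
        """Feed one frame decision; return "start", "end", or None."""
        if voiced:
            self.silent_run = 0  # any voiced frame resets the silence counter
            if not self.in_phrase:
                self.in_phrase = True
                return "start"
            return None
        if self.in_phrase:
            self.silent_run += 1
            if self.silent_run >= self.silence_frames_to_end:
                self.in_phrase = False
                self.silent_run = 0
                return "end"
        return None
```

Short pauses inside a sentence do not end the phrase, because the counter resets as soon as voice returns.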
Level 3
The bot processes the speaker’s phrase, including recognition and preparation of responses using RASA. This process is independent of whether the bot is currently broadcasting. RASA response preparation may take into account simultaneous broadcasting.
In case of simultaneous broadcasting, two behavior policies are implemented:
The bot continues broadcasting regardless of the speaker.
The bot stops broadcasting.
The policy can be selected in one of two ways:
Option 1
The choice is made at the bot’s launch through configuration.
Option 2
The choice is made by a RASA script depending on the content of the received message.
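The two policies and the two ways of choosing between them could be sketched roughly as follows. This is an assumption-laden illustration: the `barge_in` key inside a Rasa `custom` payload is a hypothetical convention, and the default stands in for whatever the startup configuration provides.

```python
from enum import Enum


class BargeInPolicy(Enum):
    CONTINUE = "continue"  # the bot keeps broadcasting regardless of the speaker
    STOP = "stop"          # the bot stops broadcasting


# Assumption: in a real deployment this value would be read from configuration at launch.
DEFAULT_POLICY = BargeInPolicy.STOP


def resolve_policy(rasa_message: dict, default: BargeInPolicy = DEFAULT_POLICY) -> BargeInPolicy:
    """Use a per-message override if present (hypothetical "barge_in" key), else the default."""
    custom = rasa_message.get("custom") or {}
    value = custom.get("barge_in") if isinstance(custom, dict) else None
    if isinstance(value, str):
        try:
            return BargeInPolicy(value)
        except ValueError:
            pass  # unknown value: fall back to the configured default
    return default


def should_stop_playback(bot_is_playing: bool, caller_is_speaking: bool,
                         policy: BargeInPolicy) -> bool:
    """Playback is interrupted only when both sides overlap and the policy says stop."""
    return bot_is_playing and caller_is_speaking and policy is BargeInPolicy.STOP
```

A per-message override is useful for prompts that must be heard in full, such as legal notices, while ordinary prompts stay interruptible.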
Unfortunately, we are not ready to release the project yet, as it is heavily dependent on technical implementation. If you have specific tasks, we are ready to discuss them.
We can try supporting both English and Spanish; it all depends on the quality of the STT (Speech-to-Text) and TTS (Text-to-Speech) services. Thank you for the link. It’s an interesting project. Do you already have any developments?
Hello Sir,
Firstly, I would like to congratulate you on your milestone. I am also working on a similar project. I have created a Rasa voice bot using faster-whisper for speech-to-text and Silero for text-to-speech. We would like your guidance on connecting Asterisk with our Rasa voice bot. We are based in India and would appreciate your help.
Thank you
Excited to share a new experience from our team about the successful testing of our auto-dialer system!
As part of integrating Asterisk VoIP telephony with the Rasa chatbot, we ran a pilot at a district clinic. The task was to automatically inform patients about upcoming appointments and process their responses.
Key stages of the testing:
A staff member exported an Excel file from the medical system with information about upcoming appointments, made necessary adjustments, and uploaded it to cloud storage.
Our system processed the file and initiated the auto-dialer.
Upon task completion, the responsible staff member received an email with the call results, including the recognition confidence for each response.
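One way to picture the dialing stage is a campaign runner that never uses more than a fixed number of lines at once. The sketch below is an assumption about the general approach, not the project's code: `run_campaign` and the `place_call` callback are hypothetical names, and the callback stands in for whatever actually originates a call through Asterisk.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def run_campaign(
    numbers: Iterable[str],
    place_call: Callable[[str], Awaitable[str]],
    max_lines: int = 15,
) -> Dict[str, str]:
    """Call every number, keeping at most max_lines calls in progress at any moment."""
    semaphore = asyncio.Semaphore(max_lines)
    results: Dict[str, str] = {}

    async def dial(number: str) -> None:
        async with semaphore:  # wait here until a line is free
            results[number] = await place_call(number)

    await asyncio.gather(*(dial(n) for n in numbers))
    return results
```

The returned mapping of number to outcome is the kind of data that could then be summarized in the results email.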
Call flow and response options:
Robot: “Hello! You have an appointment with Dr. [Name] at [time]. Will you be able to attend?”
Patient: “Yes, I will.”
Robot: “We look forward to seeing you. Please arrive 10 minutes before your appointment.”
Patient: “No, I can’t make it.”
Robot: “You need to cancel your appointment in your personal account. Do you need assistance?”
Patient: “No, I don’t need help.”
Robot: “Thank you, goodbye.”
Patient: “Yes, I need help.”
Robot: “A clinic staff member will contact you today, please wait for the call.”
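The branches of this call flow can be written down as a small state table. In the described system the dialogue is driven by Rasa, so the plain-Python version below is only an illustration; the state names and the `affirm`/`deny` intent labels are assumptions.

```python
from typing import Dict, Iterable, Tuple

# Prompts taken from the call flow above, keyed by hypothetical state names.
PROMPTS: Dict[str, str] = {
    "confirm": "Hello! You have an appointment with Dr. [Name] at [time]. Will you be able to attend?",
    "confirmed": "We look forward to seeing you. Please arrive 10 minutes before your appointment.",
    "offer_help": "You need to cancel your appointment in your personal account. Do you need assistance?",
    "goodbye": "Thank you, goodbye.",
    "callback": "A clinic staff member will contact you today, please wait for the call.",
}

# (current state, recognized intent) -> next state
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("confirm", "affirm"): "confirmed",
    ("confirm", "deny"): "offer_help",
    ("offer_help", "deny"): "goodbye",
    ("offer_help", "affirm"): "callback",
}

FINAL_STATES = {"confirmed", "goodbye", "callback", "unrecognized"}


def next_state(state: str, intent: str) -> str:
    """Anything outside the table is treated as an unrecognized response."""
    return TRANSITIONS.get((state, intent), "unrecognized")


def run_call(intents: Iterable[str]) -> str:
    """Walk the flow for a sequence of recognized intents and return the outcome."""
    state = "confirm"
    for intent in intents:
        state = next_state(state, intent)
        if state in FINAL_STATES:
            break
    return state
```

For example, `run_call(["deny", "affirm"])` ends in the `callback` outcome, which corresponds to the patient asking for help from clinic staff.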
Our robot successfully recognized key phrases of patient agreement and disagreement, and also identified mobile operator voicemail systems during the call.
Testing results:
Conducted 10 auto-dial campaigns, each with 1500-2500 numbers.
Number of simultaneous lines used: 15 to 20.
Average response rate: around 75%.
Average call duration: 30-45 seconds.
Average task completion time: about 2 hours.
Detailed results:
50% confirmed their appointment.
20% of calls were identified as going to voicemail.
10% of responses were not recognized (mostly due to voice assistants).
15% declined the appointment and did not need assistance.
5% requested a callback from a clinic staff member.
The response rate and patient engagement were high, even considering the significant number of older patients. Clinic staff also positively evaluated the call results, and the ability to review call transcripts allowed for improvements in the auto-dialer performance.
Unfortunately, the client cannot provide metrics on appointment attendance changes, as this information is sensitive. Moreover, implementing the service is not planned at this time due to budget constraints, but there is an intention to seek funding for the next year.
Nonetheless, we successfully tested and improved our service, and the potential client recognized its advantages.
If you have any use cases or are interested in possible collaboration, we would be happy to discuss!
We’re excited to share that we’ve upgraded our voice bot, which runs on the Asterisk VoIP platform and uses the RASA chatbot server, by switching from Yandex SpeechKit to Google Speech-to-Text!
This change makes our bot better at understanding and responding to what people say, with more language options to boot. It’s all about making things easier and faster for everyone who interacts with us.
What Speech-to-Text tools do you use or recommend? Let us know in the comments!