The DIET classifier can no longer be considered state-of-the-art compared to GPT-4 and other LLMs that can be used for classification and entity recognition.
Rasa has somewhat lost the open-source character that was valued in the company's placement in the Gartner Magic Quadrant for Conversational AI 2022, due to the monetization of Rasa X.
What is Rasa's edge over competitors, given those circumstances and its opaque pricing (compared to other vendors)?
Hi, I tend to disagree with you. Rasa Open Source does not work like GPT, yes. The way Rasa works (using partially hand-crafted models) is very relevant for many commercial scenarios. GPT, to my knowledge, is neither accurate (it hallucinates) nor can you ask it for an (accurate) explanation of why it came up with an answer (no explainability; though Rasa is also far from great here). Therefore, when accuracy and transparency are critical for a scenario, you might not want to use GPT but Rasa. In addition, GPT is not open-source software (OpenAI is not "open") and is inherently opaque due to the "global" and "always online" language model. LLaMA seems to be somewhat better in the latter respect, but explainability and accuracy likely remain a problem.
Regarding Rasa X, I think we lack open-source alternatives for its features (not all, but some, like analytics and basic bot management). If the community could ignite new projects here, I think it would strengthen Rasa Open Source adoption.
What you’re referring to in your first argument is response generation. I agree with you on that point. However, I meant to compare the DIET classifier (which was once Rasa’s USP) and GPT in terms of classification capability. I’m sure that GPT has surpassed DIET in terms of accuracy, usability and flexibility.
Whether or not to use GPT for response generation depends on the use case and is not a bad idea per se, in my opinion.
I totally agree with your second point. I think Rasa's decision to end support for an open-source version of Rasa X caused the whole product to lose adoption, and the community is therefore shrinking in terms of support.
Overall, I think we still need more responses to identify what remains Rasa’s competitive edge.
As far as I understand GPT, it does not classify intents. It transforms the user's string input directly into a string output, considering the conversation text (which is a series of string inputs/outputs). You could use it, though, to tell you which item in a list of intents (that you provide to GPT) best fits a user utterance - if this is what you mean.
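To make the idea concrete, here is a minimal sketch of that "pick from a list" approach. The `complete` callable stands in for any prompt-to-text LLM API wrapper (OpenAI, LLaMA, ...), and the intent names are invented for illustration:

```python
# Sketch: using an LLM to pick the best-fitting intent from a fixed list.
# `complete` is an assumption standing in for any LLM API call.

INTENTS = ["greet", "ask_opening_hours", "book_ticket", "goodbye"]

def build_intent_prompt(utterance: str, intents: list[str]) -> str:
    """Ask the model to answer with exactly one intent label."""
    options = ", ".join(intents)
    return (
        f"Classify the user message into exactly one of these intents: {options}.\n"
        f"Answer with the intent name only.\n"
        f"User message: {utterance!r}\nIntent:"
    )

def classify(utterance: str, complete) -> str:
    """`complete` is any prompt -> text function (e.g. an LLM API wrapper)."""
    answer = complete(build_intent_prompt(utterance, INTENTS)).strip().lower()
    # Fall back to a default if the model returns something outside the list.
    return answer if answer in INTENTS else "out_of_scope"
```

The fallback matters: unlike a classifier with a fixed label set, a generative model can always return free text, so you have to validate its answer against the list yourself.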
Since Rasa X is no longer usable by small projects due to the cost of the enterprise licenses, I started looking into this open source business application framework for Python: GitHub - Avaiga/taipy (https://www.taipy.io/).
It offers a simple GUI and a (more complex) analytics backend to analyze, for instance, Rasa trackers. I think that with some effort one could recreate several functions of Rasa X with it.
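As a starting point for such a tool, the conversation data can be pulled from a running Rasa server (started with `rasa run --enable-api`) via its HTTP API. A minimal sketch, where the host, port, and conversation id are placeholders:

```python
# Fetch a conversation tracker from Rasa's HTTP API and extract the user
# utterances -- the raw material for an analytics front end such as Taipy.
import json
from urllib.request import urlopen

def fetch_tracker(conversation_id: str, host: str = "http://localhost:5005") -> dict:
    """GET /conversations/{id}/tracker from a Rasa server with --enable-api."""
    with urlopen(f"{host}/conversations/{conversation_id}/tracker") as resp:
        return json.load(resp)

def user_messages(tracker: dict) -> list[str]:
    """Extract the raw user utterances from the tracker's event stream."""
    return [e["text"] for e in tracker.get("events", []) if e.get("event") == "user"]
```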
I agree. I've been trying so hard to get clear pricing from the team, and I'm getting nowhere. It'll be hard to build things for business if we have no clarity on pricing.
The main difference between systems like Rasa and GPT-X is that Rasa is a closed-domain system in which one clearly specifies intents and entities, whereas GPT-X is an open-domain system.
The intent and entity recognition of Rasa is deterministic and controllable - this doesn't mean it's great, but you can test it better in advance. OTOH, one also needs systems like Rasa X or similar to continuously train and enhance the system, or at least access to all conversations, so that one can extract relevant conversation data to improve the NLU.
While I don't say one doesn't need anything like that for GPT-X, I'd say GPT-X makes Rasa X almost obsolete. You certainly want to know what's going on inside your bot, i.e. review conversations, but there is no point in "improving" GPT-X NLU, as it's a black box anyway. You also no longer need to serve or train models.
If you need deterministic action handling, or don't want to share user conversations with OpenAI/Microsoft, Rasa still makes sense - e.g. in banking or health care, or wherever you care about personal data.
As of April 13, this is what I received after poking them to give me some pricing for purposes of fundraising: " Our subscriptions are typically a multi-year engagement. Subscription pricing for Rasa Platform ranges from approximately $300,000 to multiple millions of dollars per year. We typically advise our customers to have a minimum budget of at least $300,000 per year in order to partner with us. Our customers will either start with purchasing Rasa Pro on its own, or combine it with Rasa X/Enterprise, our low-code UI, in order to get access to Rasa Platform for their conversational AI teams. It is important to note Rasa X/Enterprise cannot be purchased on its own. On the other hand, Rasa Pro will require a minimum budget of $150,000."
This was also my expectation regarding the services. The main issue with Rasa for us is the lack of low-cost offerings (compared to the $300k+ services), including services for simple low-code platform access. Initially, I thought Rasa X would become an affordable tool for small and medium-sized teams too.
@lumpidu Do you really mean “it makes Rasa-X almost obsolete”?
Although I'm not a native English speaker, I understand 'obsolete' to mean that one doesn't need Rasa X.
To me that contradicts the deterministic and controllable capabilities of Rasa that you mention…
I meant that you don't need many capabilities of Rasa X if you have a GPT-X-based system.
You foremost need to review the conversations that happened, and probably you need prompt fine-tuning, vector-search configuration interfaces to upload and index documents, etc. But most other functionality of Rasa X is not necessary in combination with GPT-X.
You basically need a different kind of administration interface.
I do not agree with this view. Generative models are fundamentally different from intent-based models. The former is, in the case of GPT, a symbol/text generator with no guarantees or explainability concerning the output (correct me if I've missed the latest improvements). The latter is a grey box where the designer controls each response and the conditions under which it occurs.
If your use case requires perfect control over responses, you can hardly use GPT to answer, because it may answer inappropriately. You can modify the prompts to ensure that the generated response includes specific keywords the designer controls, but this has limits (and it is difficult to find them all).
Regarding Rasa X, you can also try botario, which is way cheaper but still expensive for smaller projects. You can also create custom editors with Taipy or Gradio in combination with the Rasa REST API. Rasa is still competitive, but the more features they hide behind the Rasa Pro paywall, the less attractive it becomes to me.
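A minimal sketch of that Gradio-plus-REST-API combination: a chat UI that forwards messages to Rasa's REST channel. It assumes a Rasa server running on localhost:5005 with the `rest` channel enabled in `credentials.yml`; the URL and sender id are placeholders:

```python
# Tiny custom chat front end: Gradio in front of Rasa's REST channel.
import json
from urllib.request import Request, urlopen

RASA_URL = "http://localhost:5005/webhooks/rest/webhook"

def build_payload(sender: str, message: str) -> bytes:
    """The JSON body the REST channel expects: sender id + message text."""
    return json.dumps({"sender": sender, "message": message}).encode()

def ask_rasa(message: str, history) -> str:
    """Send one user message to Rasa and join the bot's replies."""
    req = Request(RASA_URL, data=build_payload("gradio-user", message),
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        replies = json.load(resp)  # list of {"recipient_id": ..., "text": ...}
    return "\n".join(r.get("text", "") for r in replies) or "(no reply)"

if __name__ == "__main__":
    import gradio as gr  # third-party: pip install gradio
    gr.ChatInterface(ask_rasa).launch()
```

From here, the same pattern (a small UI over the HTTP API) extends to review or annotation screens, which is roughly what the cheaper Rasa X replacements would need.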
Our team built a chatbot for a client that handled 300k+ conversations a month before ChatGPT was released. The issue we saw with setting up a Rasa bot, or any intent-based chatbot, is mainly the cold start. You need a long and rigorous process to set up data and stories to make the chatbot work reliably. We had to spend months cleaning data (just for slot filling and NLG). Some clients we work with don't even have good conversation designs or properly classified intents, even though they have set up an intent-based bot.
ChatGPT, or LLMs in general, can replace this warm-up process if you just set up the right prompts. Once the chatbot has run in production for a while with a human in the loop, you can quickly generate good-enough data and move on to few-shot prompts and eventually Rasa.
With all that being said, Rasa is good to start and use if you have a strong dataset. Otherwise you are better off kickstarting with a foundation model.
I agree with you if the LLM supports intent classification and entity extraction. For that, I see concrete benefits. But the decision logic for generating responses can (and in some cases should) work without it and still produce good answers - or does someone have other experiences?
Very interesting topic, btw. In general, I'd say that Rasa has a much steeper learning, configuration and maintenance curve (and cost) than GPT-X.
Also, we can see that answering questions in some more complex scenarios can be accomplished much more quickly with an appropriate prompt via GPT-X.
Example: if you want to answer questions about the cost of public swimming pools, there are many different prices for kids, elderly people, pupils, etc., depending on subscription intervals: monthly, yearly, a single ticket, 10 tickets at once, with or without sauna/gym, etc. Doing this in Rasa with all the different possible questions/entities/stories/forms/slots is rather convoluted. You also probably need to write a custom action that takes the current price list into account; otherwise you'd have to train a new model for each new price list.
With GPT-X, you just give it the current price table with an appropriate prompt, and the answers are (mostly) spot-on. It's also effortless to answer follow-up questions if you add all the conversation context. In Rasa, you need to define several stories, think carefully about slots for possible follow-up questions, and constantly monitor the system for questions that you haven't already put into your stories/rules.
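A sketch of that prompt, with an invented price table; the instruction to stick to the table is one common way to limit made-up prices, though it is no hard guarantee:

```python
# Build a pricing prompt that hands the model the current price table
# instead of encoding it in stories/slots. Table contents are illustrative.

PRICE_TABLE = """\
ticket            adult    child    senior
single            5.00     2.50     3.50
10-ticket card    45.00    22.00    31.00
monthly           30.00    15.00    21.00
sauna surcharge   +3.00 per visit
"""

def build_pricing_prompt(question: str, table: str = PRICE_TABLE) -> str:
    return (
        "You answer questions about swimming pool prices.\n"
        "Use ONLY the price table below; if the answer is not in the table, "
        "say you don't know.\n\n"
        f"{table}\nQuestion: {question}\nAnswer:"
    )
```

Swapping in a new price list is just replacing the `table` argument, with no retraining and no story changes.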
The problem we foremost have with GPT-X: it's damn slow! It easily needs 5-10 seconds for some answers. We have only tried the OpenAI API, not the Azure API yet, so I can't say whether that is any better.
If you think about it: doing entity extraction with a turnaround of 5 seconds, then finally generating the answer in another 5-10 seconds … the user/customer needs a lot of patience to get an answer. This will probably improve over time, but currently it is a real limiting factor.
Another factor is that you need to test your prompts carefully so that GPT-X doesn't hallucinate "facts". This means one also needs to run these tests often enough, because OpenAI does "change the models under your feet".
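One simple way to run such checks regularly is a small regression suite: re-ask a fixed set of questions and assert each answer still contains the required facts. `ask_model` below stands in for whatever API wrapper you use, and the questions and facts are invented for illustration:

```python
# Minimal prompt-regression harness: detect when a model update silently
# drops a fact from an answer. `ask_model` is an assumed API wrapper.

REGRESSION_SUITE = [
    # (question, facts that MUST appear in the answer)
    ("How much is a monthly adult ticket?", ["30.00"]),
    ("Do you sell yearly parking permits?", ["don't know"]),
]

def run_suite(ask_model) -> list[str]:
    """Return the questions whose answers lost a required fact."""
    failures = []
    for question, required in REGRESSION_SUITE:
        answer = ask_model(question).lower()
        if not all(fact.lower() in answer for fact in required):
            failures.append(question)
    return failures
```

Keyword matching is crude (a paraphrased but correct answer would fail), but it catches the worst case: a confidently wrong price after a silent model change.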
We have been using OpenAI strictly in closed-domain chatbots since 2023, and we have been building a framework that controls the LLM's decision making and flexibly changes prompts based on dialog state. There is a lot of prompting work and state modeling we need to do, but the results are good in production. Now that we have a good batch of samples, we are trying to further improve accuracy by using embedding retrieval of past conversations and letting the LLM follow the examples.
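The retrieval step described here can be sketched in a few lines: embed past conversations, then pick the most similar ones as few-shot examples for the prompt. A real system would use an embedding API; the toy bag-of-words vectors below are an assumption that keeps the sketch self-contained:

```python
# Retrieve the most similar past conversations to use as few-shot examples.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; swap in a real embedding API in practice."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_examples(query: str, past: list[str], k: int = 2) -> list[str]:
    """Rank past conversations by similarity to the query; keep the top k."""
    q = embed(query)
    ranked = sorted(past, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]
```

The selected examples are then prepended to the prompt so the model can imitate how a human agent handled similar cases.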
With a human in the loop, I can see this method being very competitive. In reality, none of us can remove all humans from customer support or sales settings, so why not keep them in and let the AI learn from them?
Have you tried promptai.us, which provides a free tool to annotate dialog flows and can automatically generate Rasa code and stories? It might be a good alternative to Rasa X. Both DIET and Forms are great ideas. However, before it is too late, they need to integrate pretrained LLMs soon so that annotation cost can be reduced and flexibility improved.