Rasa + AWS Lambda

Hello everyone,

I was wondering if anyone has any suggestion on how to deploy Rasa in AWS. More specifically:

  1. What are the best practices to use Rasa in AWS cloud services?
  2. Can we deploy Rasa in AWS Lambda? I’ve noticed that other people mentioned that it might be currently larger than what can be fitted into a lambda function. So I was wondering if anyone had any experience/workaround/suggestion on how to deploy Rasa in AWS Lambda.
  3. Is there any light version of Rasa that could be fit in Lambda functions, or any guidance to build one.


Best, Behnam

I think Lambda is a great idea. They are stateless which means very cost-effective, but also you’d need to implement one of the persistent data stores for your Tracker implementation, and, almost certainly, the custom one, connecting to DynamoDB. I’ve not seen any code immediately around that does this but it would be very useful.

Also, the default lambda behaviour is that the first request boots up a VM temporarily and keeps it around for about 10 minutes. So the first user of your service (be it human or a bot) will be 6+ seconds. The most efficient solution is almost certainly provisioned concurrency that keeps things warm for you at a slight increase in price. Very cool stuff.

Thanks for the reply. I agree that Lamda is usually a great choice and we can keep it warm to make it faster. However, the problem with deploying Rasa on a lambda is the lambda disk space limitation. The disk space is limited to 512MB and Rasa (at least the default version with no optimization) is way larger than that (~2GB). Therefore, I think Lambda may not be a good choice for Rasa deployment. Please correct me if anyone has any other experiences.

1 Like

Is there any way to cache some of the common conversation patterns so you do not need an actual rasa running, so you can use that time to actually start the rasa container, for eg you could have 0 rasa container running at any time but when you get a request you add container >0 and during the warm-up time you could reply through this caching functionality running on lambda?

Hi, would it be possible to deploy rasa(rasa nlu) in aws lamba as a docker container?

the container image code package size is 10 GB.

https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html https://docs.aws.amazon.com/lambda/latest/dg/lambda-images.html

We deploy Rasa on AWS for clients using AWS ECS with Fargate instances. We have also built pieces like AWS Batch training pipeline, S3 Tracker Store and SQS Event Broker so the Rasa instances are fully deployed in managed fashion.

The only advantage we get from deploying on Lambda is cost, but I think a 1core 2g Fargate instance is also manageable in terms of cost if the bot is not that intensive to run, and the service will be much more responsive.

However, I spent some time today reading through the source code of Rasa and the Lambda Runtime API, and I think it is doable, but it will take some hacking.

The main problem is that AWS lambda has its own runtime API and we need to essentially write our own lambda runtime and wrap around Rasa. When Rasa core starts the agent will be loaded and a Sanic server will start. The Sanic instance is likely not that useful in a lambda function, so we will need to bootstrap just the Rasa agent somehow for the lambda runtime. The Rasa agent needs to be warmed up at the time the runtime starts, so we need to expect quite a bit of warm-up time.

AWS has built a lambda runtime client using Python so I will likely hack around both the lambda runtime client and spin up Rasa agent directly.

We are also investigating other serverless offerings, I think if we want to save cost, Google Cloud Run would work since they offer a request time pricing similar to lambda, and it seems relatively easier to set up. I don’t have much experience with GCP so I cannot comment any further.

Our team has been doing open source work around Rasa and we will be doing this lambda experiment later this month and early next month. And hopefully I will get back with some findings then.

1 Like