I want to ask about the throughput limit on the Rasa Core HTTP server. When I use a large model, I hit a limit on the Rasa REST input channel: it can only process around 20-25 requests per second. When I use a smaller model, it can process 100 requests per second. Is this normal? Do I need a much more powerful machine to serve the large model? Thanks in advance.
The large story model is approximately 200 times larger than the small one (we wrote the small one manually and generated the large one automatically with a script).
- Edit: when we now test it with vegeta, the large model can handle up to 2100 requests per minute. Is this normal?
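
The figures in this post are in different units (requests per second and requests per minute), so here is a rough Python sketch that puts them on one scale. The variable and function names are my own for illustration, not anything from Rasa or vegeta, and the numbers are only the ones quoted above.

```python
# Put the throughput figures from this post on a single scale (requests/second).
# All names here are illustrative; the numbers are the ones quoted in the post.

def per_minute_to_per_second(requests_per_minute: float) -> float:
    """Convert a requests-per-minute rate to requests per second."""
    return requests_per_minute / 60.0

small_model_rps = 100.0              # small model, REST channel
large_model_rps_range = (20.0, 25.0) # large model, REST channel (first test)
large_model_vegeta_rps = per_minute_to_per_second(2100.0)  # large model, vegeta test

model_size_ratio = 200.0             # large model is roughly 200x the small one

# How much slower the large model is in each measurement.
slowdown_first_test = tuple(small_model_rps / r for r in large_model_rps_range)
slowdown_vegeta = small_model_rps / large_model_vegeta_rps

print(f"vegeta result: {large_model_vegeta_rps:.1f} requests/second")
print(f"slowdown, first test: {slowdown_first_test[1]:.1f}x to {slowdown_first_test[0]:.1f}x")
print(f"slowdown, vegeta test: {slowdown_vegeta:.2f}x (model is ~{model_size_ratio:.0f}x larger)")
```

So the vegeta result works out to about 35 requests per second, somewhat above the 20-25 seen earlier, and in both measurements the slowdown (roughly 3x to 5x) is far smaller than the 200x difference in model size.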