High memory usage when using RASA NLU as an HTTP Server

parthsharma1996 · August 27, 2018, 11:51am

I am currently using two RASA HTTP servers with the spacy_sklearn pipeline.

Both servers have around 4-6 projects each (each project has only one model).

It seems like each time I make a call to a new project, the spaCy vectors get loaded into memory again. The reasons for my suspicion are:

The first time I query a new project, it takes ~1 minute to respond.
My RAM usage also increases by ~1 GB with each such model.

With all the models loaded, my RAM (8 GB) overflows into swap space and my computer becomes basically unusable.

I remember that when running an interpreter from Python there used to be a component builder option which would cache spaCy vectors between different interpreters. Why does that not happen automatically with the HTTP server?
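
To make the caching idea above concrete, here is a minimal sketch in plain Python. It is not Rasa's code: `SharedVectorCache`, `fake_load_vectors`, and `Project` are hypothetical names, and the loader is a placeholder for the expensive step of loading spaCy vectors. The option mentioned above is most likely the `ComponentBuilder` that can be passed to `Interpreter.load` in the rasa_nlu Python API, but that is an assumption and the sketch does not depend on it.

```python
class SharedVectorCache:
    """Hypothetical stand-in for a component cache: each language model is loaded once."""

    def __init__(self, loader):
        self._loader = loader  # callable that performs the expensive load
        self._cache = {}       # language model name -> loaded object
        self.load_count = 0    # how many real loads have happened

    def get(self, model_name):
        # Load only on the first request; later requests reuse the same object.
        if model_name not in self._cache:
            self._cache[model_name] = self._loader(model_name)
            self.load_count += 1
        return self._cache[model_name]


def fake_load_vectors(model_name):
    # Placeholder for an expensive call such as loading spaCy word vectors.
    return {"name": model_name, "vectors": [0.0] * 1000}


class Project:
    """Hypothetical project that needs the language model but does not own it."""

    def __init__(self, name, cache, language_model="en"):
        self.name = name
        self.nlp = cache.get(language_model)


cache = SharedVectorCache(fake_load_vectors)
projects = [Project(f"project_{i}", cache) for i in range(5)]
```

With this pattern, five projects trigger one load and all hold a reference to the same object, so memory grows with the number of distinct language models rather than the number of projects. The sharing only works inside a single process.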

akelad · August 29, 2018, 1:58pm

@tmbo any ideas about this?

tmbo · August 29, 2018, 2:29pm

That should happen with the HTTP server as well. But I think the server might use multiple processes, so it will load the model once in every process. If you hit the server with train calls, that will also load the spaCy vectors again, as training runs in a process pool.

parthsharma1996 · September 1, 2018, 9:10am

So what do you suggest I do? I tried switching to the tf pipeline, but it misclassifies more often due to the limited number of training examples.
