Rasa NLU impossible to release memory

Rasa version: 1.10.3

Hi, I’m a beginner with Rasa and I’m currently working on a project where I only need the Rasa NLU parser to detect intents with several models. So I’m creating several instances of the Interpreter class, each with its own model. I can make predictions with multiple models through the parse method, and it works fine. In my case, I can’t afford to keep all the Interpreters in RAM, because I will potentially need to load a lot of them, so I created a sort of buffer to create and discard Interpreters easily.

The problem is that I can’t release the memory used by an Interpreter that is no longer needed. I tried Python’s del statement combined with a gc.collect(), but it does not release the memory. For example, with this script:

[screenshot: test script that loads and deletes Interpreters, recording memory before and after]

I obtain (memory before and after the del):

[470.86328125, 470.86328125, 470.86328125, 470.86328125, 470.86328125] [470.86328125, 470.86328125, 470.86328125, 470.86328125, 470.86328125]

If someone has an idea…

[EDIT]: It seems that components always have more than one reference to them, which, if I understood correctly, explains why the Python garbage collector can’t release the memory allocated to them. I don’t understand why, but there is a sort of ‘infinite chain of referenced objects’: when I use gc.get_referrers on a component to get the list of objects that refer to it, I get another object (often a mysterious instance of the ‘frame’ class), which itself has an object that refers to it, and so on. Because of that, it is impossible for the garbage collector to release the memory used by the component. And this is the case for every component used by the default pipeline.

The only workaround I found is to run interpreters in separate child processes using the multiprocessing Python library: when the child process (which runs the interpreter) is stopped, all its memory is released. But I’m forced to run several processes, and I wonder whether I will hit the memory limit sooner with that kind of implementation… and it is quite complex. Is there really no way to ‘unload’ a component?


Hey @pcloury, I faced the same issue. In fact, I wanted to create a server that serves multiple Rasa NLU models, reusing Rasa’s code (they actually use Sanic as the server framework). To debug how effectively Rasa releases memory, I rewrote their code in a simpler way (still using the Sanic server, getting rid of Interpreter.load, and removing some nested classes) and found out:

The memory problem is not in the Rasa code (their code is quite stable, I think); the problem is due to TensorFlow.

Yes, Rasa uses DIETClassifier, which is written in TF. DIETClassifier has a model called DIET (which is basically a tf.keras.Model). To my knowledge, TF engineers do not focus much on garbage collection, so the problem is still there.

NOTE: In your code, I think del interpreter is not enough; you should also add

agent.interpreter.interpreter.pipeline.clear() # since pipeline is a List

del agent.interpreter.interpreter.pipeline

(I am using Rasa 1.10.0 and TF 2.1.0.)

Maybe this TensorFlow memory problem is the reason why Rasa removed serving multiple models in recent versions :slight_smile: