Restrict GPU memory use for inference in Rasa models

I have a custom Rasa chatbot for Spanish that uses the Spacy model with the EmbeddingIntentClassifier, plus a KerasPolicy with an LSTM. My problem is that the model uses all of the GPU's memory during inference. I looked for ways to restrict memory growth and found this page: Use a GPU  |  TensorFlow Core, but I don't know how to apply those solutions in Rasa.
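For example, since EmbeddingIntentClassifier and KerasPolicy suggest I'm on Rasa 1.x (which still runs on TensorFlow 1.x), I tried the TF1-style session config from that guide, something like the sketch below. The `allow_growth` and `per_process_gpu_memory_fraction` options are from the TensorFlow docs; where to actually hook this into Rasa is my guess, and it doesn't seem to take effect:

```python
# Minimal sketch, assuming Rasa 1.x on TensorFlow 1.x.
# I put this at the top of my startup script before loading any
# Rasa model, but I'm not sure that's the right place to hook it in.
import tensorflow as tf
from tensorflow.keras import backend as K

config = tf.ConfigProto()
# Allocate GPU memory on demand instead of grabbing it all up front.
config.gpu_options.allow_growth = True
# Alternatively, cap this process at a fixed fraction of GPU memory:
# config.gpu_options.per_process_gpu_memory_fraction = 0.5
K.set_session(tf.Session(config=config))
```

Does Rasa create its own TensorFlow session internally, so that a session set this way is ignored? If so, where is the right place to pass these GPU options?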

I appreciate your help.