I am currently using two RASA HTTP servers with the spacy_sklearn
pipeline.
Both servers have around 4-6 projects each (each project has only one model).
It seems like each time I make a call to a new project, the spaCy
vectors are loaded into memory again. The reasons for my suspicion are:
- The first time I query a new project, it takes ~1 minute to respond
- My RAM usage increases by ~1 GB with each such model
With all the models loaded, my RAM (8 GB) overflows into swap space and my computer basically becomes unusable.
I remember that when running an interpreter from Python, there used to be a component builder option that would cache the spaCy
vectors between different interpreters. Why does that not happen automatically with the HTTP server?
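
To illustrate the behaviour I expected, here is a minimal sketch of the caching idea in plain Python. It does not use the RASA or spaCy APIs: `SharedResourceCache`, `fake_load_vectors` and the project names are hypothetical stand-ins, and the real vector load is replaced by a dummy function.

```python
from typing import Callable, Dict


class SharedResourceCache:
    """Loads an expensive resource once per key and reuses it afterwards."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader
        self._cache: Dict[str, object] = {}
        self.load_count = 0  # how many times the expensive loader actually ran

    def get(self, key: str) -> object:
        # Only call the loader the first time a key is requested.
        if key not in self._cache:
            self._cache[key] = self._loader(key)
            self.load_count += 1
        return self._cache[key]


def fake_load_vectors(language: str) -> dict:
    # Hypothetical stand-in for the slow, memory-heavy spaCy vector load.
    return {"language": language, "vectors": [0.0] * 4}


cache = SharedResourceCache(fake_load_vectors)

# Hypothetical projects that all use the same language model.
projects = ["project_a", "project_b", "project_c"]
shared = {name: cache.get("en") for name in projects}
```

With a cache like this shared across projects, the vectors would be loaded once per language rather than once per project, so memory would not grow by ~1 GB for every model.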