Reposting: my earlier post was mistakenly flagged as spam and removed by a moderator. This version also adds new findings.
I have a full Flask app containerized in Docker, which calls an external NLU service (also containerized). Everything works fine in dev mode (not containerized, just a Python virtual environment), but in production current_app.agent.handle(msg) hangs.
I have three policies (MemoizationPolicy(), KerasPolicy(), and a fallback), and it only hangs on the Keras one.
It seems this is an issue with running Keras predict() on the TensorFlow backend inside production Flask under uWSGI, and it may be related to the general problem of calling Keras predict() from parallel workers:
So it must be about how Flask runs in production mode, most likely its parallel execution model (multiple worker processes/threads), which breaks Keras model.predict().
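One related cause worth checking in this setup: by default uWSGI imports the app once in the master process and then fork()s the workers, and a TensorFlow session created before the fork can hang in the children. uWSGI's lazy-apps option makes each worker import the app (and therefore load the model) itself, after the fork. A hypothetical config sketch (module name and worker count are illustrative):

```ini
; uwsgi.ini -- illustrative values, not the project's actual config
[uwsgi]
module = app:app      ; Flask app object
master = true
processes = 4
; make each worker import the app after fork, so the Keras/TF model
; is loaded inside the worker that will call predict()
lazy-apps = true
```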
"When using Keras with the TensorFlow backend in an async Flask environment, model.predict() does not work if the first call happens in a different thread from the one where the model was loaded. This issue is not specific to deepcut; it affects any async Keras+TF setup. A workaround is needed until Keras handles this itself. (Keras+Theano doesn't have this issue.)"
As such, the solution could be to load the model in the same thread in which its predict() method is later called.
In Flask I fixed this for my project by moving my Rasa agent setup code (which uses Keras with the TensorFlow backend for prediction) into an @app.before_first_request handler and storing the agent on the app object (current_app), so loading happens inside a request-handling worker rather than at import time.
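The fix can be sketched roughly as follows, with a dummy agent standing in for the real Rasa Agent.load(...) call (the names and the /chat endpoint here are illustrative, not my actual project code). Note that @app.before_first_request was removed in Flask 2.3, so this sketch uses lazy initialization on first use, which has the same effect of loading the model inside the worker that serves requests:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

class DummyAgent:
    """Stand-in for a Rasa agent whose handle() would call Keras predict()."""
    def handle(self, msg):
        return [{"text": "echo: " + msg}]

def get_agent():
    # First call creates the agent inside a request-handling thread,
    # so model loading and later predict() calls share a thread.
    # (In a multi-threaded worker you would guard this with a lock.)
    if not hasattr(app, "agent"):
        app.agent = DummyAgent()  # real code: Agent.load("models/dialogue", ...)
    return app.agent

@app.route("/chat", methods=["POST"])
def chat():
    msg = request.get_json()["message"]
    return jsonify(get_agent().handle(msg))
```

With lazy-apps enabled in uWSGI as well, each worker process ends up with its own copy of the model, loaded in the thread that uses it.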