Model.predict hangs for Keras model in Flask with uwsgi (repost)

Reposting after my earlier post was mistakenly flagged as spam and removed by a moderator; this version includes updates with new findings.

I have a full Flask app containerized in Docker that calls an external NLU service (also containerized). Everything works fine in dev mode (not containerized, just a Python virtual env), but in production the current_app.agent.handle(msg) call hangs.

It turns out that the Keras model's predict() method hangs, but only when running in production with Flask and uwsgi. This is similar to what is described here: python - keras prediction gets stuck when deployed using uwsgi in a flask app - Stack Overflow

It hangs at this line: https://github.com/RasaHQ/rasa_core/blob/master/rasa_core/policies/keras_policy.py#L199

I have 3 policies (MemoizationPolicy(), KerasPolicy(), fallback) and it only hangs on the Keras one.

It seems this is an issue with running Keras predict() with the TensorFlow backend in a production Flask app under uwsgi, and it might be related to the general issue of calling Keras predict() from parallel workers.

When running it from the shell, it works:

$ FLASK_APP=cli.py FLASK_DEBUG=1 flask shell
>>> app.agent.handle_message(text_message='hello', sender_id=1)
[{'recipient_id': 1, 'text': 'hey there!'}]

So it must be related to how Flask runs in production, most likely to uwsgi running multiple worker processes in parallel while one of them tries to run Keras model.predict().

I believe this is the same issue as the following:

"When using Keras with TensorFlow backend in Flask async environment, model.predict() is not working if the first call is in different thread from model loading.

This issue is not specific to deepcut, but for any async Keras+TF. Need a workaround until Keras handling this by itself. (Keras+Theono doesn’t has this issue)."

As such, the solution should be to load the model in the same thread (and process) where its predict() method is later called.
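
In plain Keras terms, the pattern looks roughly like this; the model file and input shape below are made-up placeholders, the point is only that a warm-up predict() runs in the same thread/process that loaded the model, so the TensorFlow graph is fully built there:

import numpy as np
from keras.models import load_model

# Load the model and run a throw-away prediction in the worker that will
# later serve real requests (file name and input shape are hypothetical).
model = load_model('keras_policy.h5')
model.predict(np.zeros((1, 5, 32)))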

It is fixed now!

In Flask, I had to move my Rasa Agent setup code (which uses Keras with the TensorFlow backend for prediction) into an @app.before_first_request handler and store the agent on the application object (accessible via current_app). That fixed the issue for my project.
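
For reference, a minimal sketch of that fix (the model path, route and handler names are placeholders, not my exact code):

from flask import Flask, current_app, jsonify
from rasa_core.agent import Agent

app = Flask(__name__)

@app.before_first_request
def load_agent():
    # The agent (and its Keras/TensorFlow policy) is loaded inside the uwsgi
    # worker that will also call predict(), instead of at import time.
    current_app.agent = Agent.load('models/dialogue')  # hypothetical model path

@app.route('/chat/<sender_id>/<message>')
def chat(sender_id, message):
    responses = current_app.agent.handle_message(text_message=message,
                                                 sender_id=sender_id)
    return jsonify(responses=responses)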

This really is a Keras bug (or feature); a lot of people have the same issue, see https://github.com/keras-team/keras/issues/2397
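
For anyone who cannot move the model loading around, the workaround that keeps coming up there (with the TF 1.x backend) is to grab the default graph right after loading the model and re-enter it around every predict() call; a rough sketch, not what I ended up using:

import tensorflow as tf
from keras.models import load_model

model = load_model('keras_policy.h5')  # hypothetical model file
graph = tf.get_default_graph()         # remember the graph the model was built in

def predict(x):
    # Re-enter the original graph so predict() also works from other threads.
    with graph.as_default():
        return model.predict(x)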

A footnote in the Rasa docs might be helpful for future users facing the same issue.


Thanks a lot, you saved me a lot of time.