Reposting: my earlier post was mistakenly flagged as spam and removed by a moderator. This version also adds new findings.
I have a full Flask app containerized in Docker, which calls an external NLU service (also containerized). Everything works fine in dev mode (not containerized, just a Python virtual environment), but in production current_app.agent.handle(msg) hangs.
I have three policies (MemoizationPolicy(), KerasPolicy(), and a fallback), and it only hangs on the Keras one.
It seems this is an issue with running Keras predict() on the TensorFlow backend inside production Flask under uWSGI, and it may be related to the general problem of calling Keras predict() from parallel workers:
So it must be about how Flask runs in production mode, most likely its parallel execution model (multiple worker processes/threads), which breaks Keras model.predict().
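One related cause worth checking in this setup: by default uWSGI imports the app once in the master process and then fork()s the workers, and a TensorFlow session created before the fork can hang in the children. uWSGI's lazy-apps option makes each worker import the app (and therefore load the model) itself, after the fork. A hypothetical config sketch (module name and worker count are illustrative):

```ini
; uwsgi.ini -- illustrative values, not the project's actual config
[uwsgi]
module = app:app      ; Flask app object
master = true
processes = 4
; make each worker import the app after fork, so the Keras/TF model
; is loaded inside the worker that will call predict()
lazy-apps = true
```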
"When using Keras with the TensorFlow backend in an async Flask environment, model.predict() does not work if the first call happens in a different thread from the one where the model was loaded. This issue is not specific to deepcut; it affects any async Keras+TF setup. A workaround is needed until Keras handles this itself. (Keras+Theano doesn't have this issue.)"
As such, the solution could be to load the model in the same thread in which its predict() method is later called.
In Flask I fixed this for my project by moving my Rasa agent setup code (which uses Keras with the TensorFlow backend for prediction) into an @app.before_first_request handler and storing the agent on the app object (current_app), so loading happens inside a request-handling worker rather than at import time.
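The fix can be sketched roughly as follows, with a dummy agent standing in for the real Rasa Agent.load(...) call (the names and the /chat endpoint here are illustrative, not my actual project code). Note that @app.before_first_request was removed in Flask 2.3, so this sketch uses lazy initialization on first use, which has the same effect of loading the model inside the worker that serves requests:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

class DummyAgent:
    """Stand-in for a Rasa agent whose handle() would call Keras predict()."""
    def handle(self, msg):
        return [{"text": "echo: " + msg}]

def get_agent():
    # First call creates the agent inside a request-handling thread,
    # so model loading and later predict() calls share a thread.
    # (In a multi-threaded worker you would guard this with a lock.)
    if not hasattr(app, "agent"):
        app.agent = DummyAgent()  # real code: Agent.load("models/dialogue", ...)
    return app.agent

@app.route("/chat", methods=["POST"])
def chat():
    msg = request.get_json()["message"]
    return jsonify(get_agent().handle(msg))
```

With lazy-apps enabled in uWSGI as well, each worker process ends up with its own copy of the model, loaded in the thread that uses it.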