Feedback: Upgrading to TensorFlow 2.6

Rasa will upgrade from TensorFlow 2.3 to TensorFlow 2.6 as of Rasa Open Source version 2.8.9. The upgrade is required due to security concerns, but it may affect training times in the short term.

You can read more about the upgrade on our blog. We’d like to use this thread to collect metrics and feedback from the community. @fkoerner and I will keep an eye on it, so feel free to AskUsAnything[tm].

I’ve just upgraded and I’m experiencing a huge increase in training time.

Current setup:

Rasa Version      :         2.8.9
Minimum Compatible Version: 2.8.9
Rasa SDK Version  :         2.8.2
Rasa X Version    :         0.42.3
Python Version    :         3.7.11
Operating System  :         Linux-5.13.19-2-MANJARO-x86_64-with-arch-Manjaro-Linux
Python Path       :         /home/joan/Desktop/rasa-upgrade/VihrtualApp/venv/bin/python3.7

Project repo: https://github.com/joancipria/VihrtualApp

Could you explain what you mean by “huge”? Got some before/after metrics?

Training time before upgrading was about 35 minutes; now it is about 3 hours. I moved from Rasa 2.5.2 to 2.8.9.
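
For anyone else posting before/after numbers here, this is a minimal sketch of how they could be collected and compared. The helper names `time_command` and `slowdown_factor` are made up for illustration; the only Rasa-specific part is the `rasa train nlu` command mentioned in the closing comment.

```python
import subprocess
import time
from typing import List


def time_command(command: List[str]) -> float:
    """Run a command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def slowdown_factor(before_seconds: float, after_seconds: float) -> float:
    """Return how many times longer the 'after' run took than the 'before' run."""
    return after_seconds / before_seconds


# Numbers reported above: about 35 minutes before, about 3 hours after.
reported = slowdown_factor(35 * 60, 3 * 60 * 60)
print(f"Reported slowdown: {reported:.1f}x")

# To time a real run, call e.g. time_command(["rasa", "train", "nlu"])
# once per Rasa version, inside the matching virtual environment.
```

With the figures above this comes out at roughly a 5.1x slowdown.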

Thank you for sharing! These workarounds may be an option for you:

  1. turn entity_recognition for DIETClassifier off, and use CRFEntityExtractor instead
  2. downgrade to Rasa version 2.8.8
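
The first workaround can be sketched as a small transformation of the `pipeline` section of config.yml, shown here as a list of Python dicts. This is only an illustration: the function name and the sample pipeline are hypothetical, and placing CRFEntityExtractor directly after DIETClassifier is an assumption, not a requirement.

```python
from typing import Any, Dict, List

Component = Dict[str, Any]


def move_entity_extraction_to_crf(pipeline: List[Component]) -> List[Component]:
    """Return a copy of the pipeline with DIET entity recognition off and a CRF extractor added."""
    has_crf = any(c.get("name") == "CRFEntityExtractor" for c in pipeline)
    updated: List[Component] = []
    for component in pipeline:
        component = dict(component)  # copy so the caller's config is untouched
        updated.append(component)
        if component.get("name") == "DIETClassifier":
            # DIET keeps classifying intents but no longer extracts entities.
            component["entity_recognition"] = False
            if not has_crf:
                updated.append({"name": "CRFEntityExtractor"})
    return updated


# Hypothetical pipeline, for illustration only.
pipeline = [
    {"name": "WhitespaceTokenizer"},
    {"name": "CountVectorsFeaturizer"},
    {"name": "DIETClassifier", "epochs": 100},
    {"name": "EntitySynonymMapper"},
]
for component in move_entity_extraction_to_crf(pipeline):
    print(component)
```

In config.yml terms, this amounts to adding `entity_recognition: false` under DIETClassifier and a `- name: CRFEntityExtractor` entry to the pipeline.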

Thanks for the advice! I’ve followed your instructions and training times are back to normal. Will using CRFEntityExtractor instead of DIETClassifier for entity recognition have a negative impact on the model?

It won’t be exactly the same behaviour, so I’d recommend that you try it out with a test set to see if the entities are still extracted as expected. If the performance has degraded, you can counteract this by adding more examples for those entities.
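
A minimal sketch of such a check, assuming you already have the expected and the extracted entities for each test message as `(start, end, entity)` tuples. How you obtain them (for example from the output of `rasa test nlu`) is left out, and the sample data is made up.

```python
from typing import List, Set, Tuple

Entity = Tuple[int, int, str]  # (start offset, end offset, entity name)


def entity_precision_recall(
    gold: List[Set[Entity]], predicted: List[Set[Entity]]
) -> Tuple[float, float]:
    """Compute exact-match precision and recall over all test messages."""
    true_pos = false_pos = false_neg = 0
    for expected, found in zip(gold, predicted):
        true_pos += len(expected & found)
        false_pos += len(found - expected)
        false_neg += len(expected - found)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical annotations for two test messages.
gold = [{(0, 3, "imposto")}, {(5, 9, "imposto"), (12, 18, "tipo_pagamento")}]
predicted = [{(0, 3, "imposto")}, {(5, 9, "imposto")}]

precision, recall = entity_precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same test set against the model trained with DIET and the one trained with CRFEntityExtractor gives two pairs of numbers to compare.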

Looking at your domain I believe you only have one entity type that was extracted by DIET (I believe the other is extracted by RegexEntityExtractor), so I am guessing you will not have problems using the CRFEntityExtractor.

Hi there!

With the configuration:

Rasa Version      :         2.8.11
Minimum Compatible Version: 2.8.9
Rasa SDK Version  :         2.8.2
Rasa X Version    :         0.42.4
Python Version    :         3.8.10
Operating System  :         Linux-4.15.0-1026-gcp-x86_64-with-glibc2.29
Python Path       :         /opt/venv/bin/python

When training the DIET classifier, this appears:

/opt/venv/lib/python3.8/site-packages/rasa/utils/tensorflow/model_data.py:750: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  np.concatenate(np.array(f)),

Is it “normal”?

Thanks

Pedro Lopes

If it’s a mere warning I wouldn’t be too concerned.
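
For context, the warning is about building a NumPy array from sequences of different lengths without asking for `dtype=object`; newer NumPy releases turn this into an error. The snippet below is a standalone illustration of the pattern with made-up data, not Rasa’s actual code.

```python
import numpy as np

# Two feature rows of different lengths, i.e. a "ragged" nested sequence.
ragged = [np.array([1, 2, 3]), np.array([4, 5])]

# Requesting dtype=object explicitly yields a 1-D array holding the two arrays,
# which is what the warning text asks for.
as_objects = np.array(ragged, dtype=object)

# Concatenating the individual arrays flattens them into one regular array.
flat = np.concatenate(list(as_objects))
print(as_objects.shape, flat)
```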

The problem is that this happened about 40% of the way through the epochs:

  File "/opt/venv/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/opt/venv/lib/python3.8/site-packages/rasa/__main__.py", line 118, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/opt/venv/lib/python3.8/site-packages/rasa/cli/train.py", line 59, in <lambda>
    train_parser.set_defaults(func=lambda args: run_training(args, can_exit=True))
  File "/opt/venv/lib/python3.8/site-packages/rasa/cli/train.py", line 91, in run_training
    training_result = train_all(
  File "/opt/venv/lib/python3.8/site-packages/rasa/api.py", line 109, in train
    return rasa.utils.common.run_in_loop(
  File "/opt/venv/lib/python3.8/site-packages/rasa/utils/common.py", line 296, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 108, in train_async
    return await _train_async_internal(
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 288, in _train_async_internal
    await _do_training(
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 334, in _do_training
    model_path = await _train_nlu_with_validated_data(
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 758, in _train_nlu_with_validated_data
    await rasa.nlu.train.train(
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/train.py", line 111, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/model.py", line 221, in train
    component.train(working_data, self.config, **context)
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/classifiers/diet_classifier.py", line 880, in train
    self.model.fit(
  File "/opt/venv/lib/python3.8/site-packages/rasa/utils/tensorflow/temp_keras_modules.py", line 190, in fit
    tmp_logs = train_function(iterator)
  File "/opt/venv/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/opt/venv/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 917, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/opt/venv/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3039, in __call__
    return graph_function._call_flat(
  File "/opt/venv/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1963, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/opt/venv/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 591, in call
    outputs = execute.execute(
  File "/opt/venv/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.ResourceExhaustedError:  OOM when allocating tensor with shape[160,765,765] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
         [[node zeros_like_40 (defined at /lib/python3.8/site-packages/rasa/utils/tensorflow/models.py:158) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
 [Op:__inference_train_function_87688]

Errors may have originated from an input operation.
Input Source operations connected to node zeros_like_40:
 cond_1/PartitionedCall (defined at /lib/python3.8/site-packages/tensorflow_addons/text/crf.py:202)

Function call stack:
train_function

And it stopped…

:sob:

Hi @nonola, this warning is indeed “normal” (we are aware of it and plan to address it in the future). I also don’t think it is related to your training stopping; that looks like an OOM (out of memory) issue. If you were able to train this model on rasa<2.8.9 and are now running into OOM, it’s possible the TF upgrade is the culprit. You have three options:

  1. train on a machine with more memory
  2. reduce the memory requirements of training your model (we can discuss different strategies for this, depending on what your config and training data look like)
  3. downgrade your rasa version back to <2.8.9
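
To put the OOM message in perspective, here is a rough size estimate for the tensor it names, `shape[160,765,765]`. It assumes that “type float” means 32-bit floats (4 bytes per element); the helper name is made up.

```python
from typing import Sequence


def tensor_bytes(shape: Sequence[int], bytes_per_element: int = 4) -> int:
    """Return the memory needed to hold one dense tensor of the given shape."""
    elements = 1
    for dim in shape:
        elements *= dim
    return elements * bytes_per_element


size = tensor_bytes((160, 765, 765))
print(f"{size:,} bytes, about {size / 2**20:.0f} MiB")
```

That is roughly 357 MiB for this single allocation. Training holds many tensors at once, so the total footprint is larger than this one number suggests.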

Hi @fkoerner,

I’m running Rasa X on a Google Cloud VM with n2-standard-16 (16 vCPUs, 64 GB memory). Shouldn’t that be enough? A few days ago I downgraded to 2.8.8, and it works.

Here is my config file:

language: pt
pipeline:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 20
  learning_rate: 0.005
  constrain_similarities: true
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 100
  constrain_similarities: true
- name: ResponseSelector
  epochs: 100
  retrieval_intent: chitchat
- name: FallbackClassifier
  threshold: 0.5
  ambiguity_threshold: 0.1
policies:
- name: MemoizationPolicy
- name: RulePolicy
- name: UnexpecTEDIntentPolicy
  max_history: 5
  epochs: 100
- name: TEDPolicy
  max_history: 5
  epochs: 100
  constrain_similarities: true

Is there anything I could improve?

Thanks!

Hi @nonola, actually, this config doesn’t look particularly heavy. I thought you might have memory-heavy featurizers like LanguageModelFeaturizer.

One thing you can try is to use CRFEntityExtractor to extract entities and set entity_recognition: False for DIETClassifier. I must say I am surprised you have so few epochs for DIET. Are you extracting entities successfully with this little training?

You could also try and see if you can get a single ResponseSelector to fulfil the purpose of two.

Hi @fkoerner!

Thanks for your feedback. I tried to use CRFEntityExtractor, but since my data is in Portuguese, some entity names contain non-ASCII characters, so I get this error:

Traceback (most recent call last):
  File "/opt/venv/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/opt/venv/lib/python3.8/site-packages/rasa/__main__.py", line 117, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/opt/venv/lib/python3.8/site-packages/rasa/cli/train.py", line 196, in run_nlu_training
    return train_nlu(
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 646, in train_nlu
    return rasa.utils.common.run_in_loop(
  File "/opt/venv/lib/python3.8/site-packages/rasa/utils/common.py", line 296, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 696, in train_nlu_async
    return await _train_nlu_with_validated_data(
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 758, in _train_nlu_with_validated_data
    await rasa.nlu.train.train(
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/train.py", line 111, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/model.py", line 221, in train
    component.train(working_data, self.config, **context)
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/extractors/crf_entity_extractor.py", line 189, in train
    self._train_model(dataset)
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/extractors/crf_entity_extractor.py", line 598, in _train_model
    entity_tagger.fit(X_train, y_train)
  File "/opt/venv/lib/python3.8/site-packages/sklearn_crfsuite/estimator.py", line 314, in fit
    trainer.append(xseq, yseq)
  File "pycrfsuite/_pycrfsuite.pyx", line 312, in pycrfsuite._pycrfsuite.BaseTrainer.append
  File "stringsource", line 48, in vector.from_py.__pyx_convert_vector_from_py_std_3a__3a_string
  File "stringsource", line 15, in string.from_py.__pyx_convert_string_from_py_std__in_string
UnicodeEncodeError: 'ascii' codec can't encode characters in position 9-10: ordinal not in range(128)

Any suggestion?

Thanks

@nonola do you have maybe three examples from an nlu.yml file that I can copy to reproduce the error?

Hi Vicente, here it goes:

- intent: prazos_pagar
  examples: |
    - até quando é que posso pagar o [imposto de selo]{"entity": "imposto", "value": "SELO"} pela participação de [óbito](óbito)?
    - qual o prazo para o pagamento de uma [divida](imposto)?
    - Quando é que [pago](pagar) o [IUC](imposto)?
    - qual o prazo para pagar o [iuc]{"entity": "imposto", "value": "IUC"}?
    - [IUC](imposto) prazo pagamento
    - qual o prazo de pagamento [iuc]{"entity": "imposto", "value": "IUC"}
    - quando tenho de pagar o [IUC](imposto)
    - quando termina o prazo para pagar o [IUC](imposto)
    - até quando posso pagar o [IUC](imposto)
    - até quando é que pago o [imposto do meu carro]{"entity": "imposto", "value": "IUC"}?
    - [iuc]{"entity": "imposto", "value": "IUC"} prazo de pagamento
    - até quando é que tenho de pagar o [imposto do carro]{"entity": "imposto", "value": "IUC"}?
    - como é que sei o prazo de pagamento do [iuc]{"entity": "imposto", "value": "IUC"}?
    - quando é pago o imposto apurado na [dmis]{"entity": "declarações", "value": "DMIS"}?
    - quando é que pago o valor da [dmis]{"entity": "declarações", "value": "DMIS"}?
    - quando recebo documento para o pagamento do [AIMI](imposto)?

Another one:

- intent: avaliação
  examples: |
    - qual o prazo para [pedir 2ª avaliação]{"entity": "tipo_avaliação", "group": "segunda", "value": "pedir"} do [imóvel](tipo_imóvel)?
    - [Para](para) [que](que) serve a [avaliação](tipo_avaliação) de um [imóvel](tipo_imóvel)?
    - [segunda avaliação]{"entity": "tipo_avaliação", "group": "segunda"} [IMI](imposto)
    - Acho que pago muito de [IMI](imposto). como [simular nova avaliação]{"entity": "tipo_avaliação", "group": "segunda", "value": "simular"}?
    - Apresentei um [pedido de avaliação]{"entity": "tipo_avaliação", "group": "normal", "value": "pedir"} no [final do ano]{"entity": "ano", "value": "anterior"} mas ainda não foi feita. A partir de quando vai ser corrigido o [valor tributário](VPT)?
    - [avaliação](tipo_avaliação) [IMI](imposto)
    - [avaliação periódica]{"entity": "tipo_avaliação", "group": "periódica"} de imóveis
    - [avaliação compra]{"entity": "tipo_avaliação", "group": "compra"} [imóvel](tipo_imóvel)
    - [avaliação](tipo_avaliação) de [imóveis](tipo_imóvel)
    - [avaliação](tipo_avaliação) de uma [casa]{"entity": "tipo_imóvel", "group": "urbano", "value": "habitacional"}

Another one:

- intent: simulação
  examples: |
    - Posso simular o pedido de [pagamento em prestações](tipo_pagamento) de uma [divida](PEF) no Portal das Finanças?
    - Simular [IRS](imposto) em prestações
    - simular [pagamento a prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"} [IRS](imposto)
    - simular [prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"}
    - Ja acedi a simulaçao 5 meses o valor é de 688.70 como posso finalizar o pedido
    - simular [prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"} de [IRS](imposto)
    - como fazer simulação de prestações de [IRS](imposto)
    - onde posso obter simulação para [pagamento prestacional]{"entity": "tipo_pagamento", "value": "pagamento em prestações"} de 39000€ em 36 meses
    - Gostaria de fazer simulação para [dividir em prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"} meu [IRS](imposto)
    - quero simular [pagamento a prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"} [IRS](imposto)
    - como faço para simular [pagamento a prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"} [IRS](imposto)
    - necessito de ajuda para simular [pagamento a prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"} [IRS](imposto)
    - Em quantas [prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"} posso pagar uma [divida fiscal](PEF)?
    - como conseguir uma simulação de [avaliação](tipo_avaliação) de imóvel
    - como conseguir uma simulação de [avaliação](tipo_avaliação) de uma casa
    - como conseguir uma simulação de [IMI](imposto)
    - como conseguir uma simulação de [IRS](imposto)
    - como conseguir uma simulação de um [plano de prestações]{"entity": "tipo_pagamento", "value": "pagamento em prestações"}
    - como consigo uma simulação de [avaliação](tipo_avaliação) de uma casa

As you can see, I have some entity names like “tipo_avaliação”, “tipo_imóvel” or “óbito” which contain non-ASCII characters.
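
A minimal sketch of one possible workaround, assuming the entity names really are what triggers the error and that renaming them to ASCII-only labels is acceptable. Both helper functions are hypothetical and not part of Rasa.

```python
import unicodedata


def is_ascii(label: str) -> bool:
    """Return True if the label can be encoded with the 'ascii' codec."""
    try:
        label.encode("ascii")
        return True
    except UnicodeEncodeError:  # same error class as in the traceback above
        return False


def to_ascii_label(label: str) -> str:
    """Strip accents by decomposing characters and dropping the combining marks."""
    decomposed = unicodedata.normalize("NFKD", label)
    return decomposed.encode("ascii", "ignore").decode("ascii")


for name in ["tipo_avaliação", "tipo_imóvel", "óbito", "imposto"]:
    print(name, is_ascii(name), to_ascii_label(name))
```

Renamed entities would also need to be updated everywhere else they are referenced, such as the domain file and any stories or rules.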

Is there any way to solve this?

Thanks!

@nonola I may have some ideas, but I think it would be more useful to discuss them in a separate thread. I’ve taken the liberty of starting one here and will copy your comments there now.

What about setting a lower batch_size for DIETClassifier? Could this work as a workaround?

@joancipria sorry, do you mean for OOM issues? Or to reduce training times?
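
If the question is about OOM, here is a rough sketch of why a smaller batch could help. It assumes, without confirmation from the traceback, that the first dimension of the `shape[160,765,765]` tensor in the earlier OOM report is the batch size and that elements are 32-bit floats. The function name and the candidate batch sizes are made up.

```python
def tensor_mib(batch_size: int, seq_len: int = 765, bytes_per_element: int = 4) -> float:
    """Estimate the size in MiB of a [batch_size, seq_len, seq_len] tensor."""
    return batch_size * seq_len * seq_len * bytes_per_element / 2**20


# Under the assumption above, the allocation shrinks linearly with batch size.
for batch_size in (160, 64, 32):
    print(f"batch_size={batch_size}: {tensor_mib(batch_size):.0f} MiB")
```

Whether a lower `batch_size` for DIETClassifier also changes training time or accuracy is a separate question that would need to be measured.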