Memory Error

MemoryError: Unable to allocate array with shape (1000, 1369, 1369) and data type float32 while training large NLU data.
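For reference, an array of that shape in float32 alone needs about 7 GiB of contiguous memory, which a quick back-of-the-envelope check confirms (plain NumPy, not part of the training code):

    import numpy as np

    # Size of the array the error complains about: shape (1000, 1369, 1369), float32.
    shape = (1000, 1369, 1369)
    bytes_needed = np.dtype(np.float32).itemsize * int(np.prod(shape))
    print(f"{bytes_needed / 1024**3:.1f} GiB")  # ~7.0 GiB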

NLU data size = 5.5 MB

Machine memory:

                 total        used        free      shared  buff/cache   available
           Mem:   31G         10G         19G         16M        1.3G         20G
           Swap:  0B          0B          0B

        When I start training the NLU data, free memory decreases like this:

                 total        used        free      shared  buff/cache   available
           Mem:   31G         10G         19G         16M        1.3G         20G
           Mem:   31G         19G         10G         16M        1.3G         20G
           Mem:   31G         26G        3.4G         16M        1.3G         20G
           Mem:   31G         28G        840M         16M        1.3G         20G

Can you tell me why free memory decreases at this rate?

Rasa NLU version = 0.13.7

Pipeline used:

    pipeline:
    - name: tokenizer_whitespace
    - name: intent_entity_featurizer_regex
    - name: ner_crf
    - name: intent_featurizer_count_vectors
      analyzer: "word"
      min_ngram: 1   # int
      max_ngram: 2   # int
    - name: intent_classifier_tensorflow_embedding
      epochs: 100
    language: en
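The count vectors featurizer builds a bag-of-words vocabulary over word 1- and 2-grams, so its feature dimension grows with the training data. As a rough illustration of how the ngram settings affect that dimension (a minimal sketch using scikit-learn's CountVectorizer on toy data, not Rasa's actual code):

    from sklearn.feature_extraction.text import CountVectorizer

    # Toy utterances (hypothetical); analyzer="word" with ngram_range=(1, 2)
    # mirrors the min_ngram/max_ngram settings above.
    examples = [
        "book a table for two",
        "book a flight to berlin",
        "cancel my booking",
    ]

    vectorizer = CountVectorizer(analyzer="word", ngram_range=(1, 2))
    features = vectorizer.fit_transform(examples)
    print(features.shape)  # (number of examples, vocabulary size incl. bigrams)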

According to this https://medium.com/rasa-blog/supervised-word-vectors-from-scratch-in-rasa-nlu-6daf794efcd8, 6000 labeled utterances took up just 120 MB.

@Akanksha1 Can you please update to the latest Rasa version and see if the problem still persists?

@dakshvar22 I trained on the new version of Rasa (version 1.3.6), but there was no improvement in training.

@Akanksha1 In your configuration for EmbeddingIntentClassifier, can you add batch_strategy: sequence and report if the result stays the same?
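For reference, that would mean changing the classifier entry in the pipeline to something like the following (a sketch of the Rasa 1.x config, assuming the rest of the pipeline stays the same):

    - name: EmbeddingIntentClassifier
      epochs: 100
      batch_strategy: sequence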