NLU training failing

I am using Rasa version 1.3.6. The data size is around 5.6 MB, with 1412 intents. I am using tensorflow-gpu 1.14.0, and also experimented with tensorflow 1.14.0.

My config.yml is as follows:

pipeline:
- name: WhitespaceTokenizer
- name: CRFEntityExtractor
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 4
  max_ngram: 6
- name: EmbeddingIntentClassifier
  epochs: 150

policies:
- max_history: 1
  name: MemoizationPolicy
- core_threshold: 0.3
  name: FallbackPolicy
  nlu_threshold: 0.8
- name: FormPolicy
- name: MappingPolicy

When I run training with this configuration, I get the following exception:

 MemoryError: Unable to allocate array with shape (168287, 38324) and data type int64

 [[{{node PyFunc}}]]
 Hint: If you want to see a list of allocated tensors when OOM happens, add 
 report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[IteratorGetNext]]
 Hint: If you want to see a list of allocated tensors when OOM happens, add 
 report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Shape/_7]]
Hint: If you want to see a list of allocated tensors when OOM happens, add 
report_tensor_allocations_upon_oom to RunOptions for current allocation info.

(1) Resource exhausted: MemoryError: Unable to allocate array with shape (168287, 38324) and data type int64
Traceback (most recent call last):

  File "/home/ubuntu/rasa_env/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 209, in __call__
    ret = func(*args)

  File "/home/ubuntu/rasa_env/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 514, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))

  File "/home/ubuntu/rasa_env/lib/python3.6/site-packages/rasa/utils/train_utils.py", line 201, in gen_batch
    session_data = balance_session_data(session_data, batch_size, shuffle)

  File "/home/ubuntu/rasa_env/lib/python3.6/site-packages/rasa/utils/train_utils.py", line 184, in balance_session_data
    Y=np.concatenate(new_Y),

  File "<__array_function__ internals>", line 6, in concatenate

MemoryError: Unable to allocate array with shape (168287, 38324) and data type int64


 [[{{node PyFunc}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add 
report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[IteratorGetNext]]
Hint: If you want to see a list of allocated tensors when OOM happens, add 
report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.
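For scale, a quick back-of-the-envelope check (my own calculation, not part of the log): a dense int64 array with the shape in the error message would need roughly 48 GiB.

    # Rough size of the dense int64 array reported in the error.
    rows, cols = 168287, 38324
    print(rows * cols * 8 / 1024 ** 3)   # ~48 GiB for a single dense matrix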

I tried a different analyzer:

 - name: CountVectorsFeaturizer
  analyzer: word

In this case, training completed successfully.

 Training time was around 2.5 hours on tensorflow and
 around 1.25 hours on tensorflow-gpu.
 Number of GPUs = 1 in this case.

I also tried with more GPUs, but the issue still occurs with the char_wb analyzer, and TensorFlow only uses device:0; no other GPU gets utilized (see the check below).

   Number of GPUs = 4
   Memory size per GPU = 12 GB
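As a sanity check on which devices TensorFlow can actually see, here is a minimal sketch (assuming tensorflow-gpu 1.14 and the standard CUDA_VISIBLE_DEVICES mechanism; the device indices are illustrative):

    import os
    from tensorflow.python.client import device_lib

    # Expose the GPUs before TensorFlow initializes a session; indices are illustrative.
    os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

    # Each visible GPU should appear here as /device:GPU:n.
    print([d.name for d in device_lib.list_local_devices()])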

I also tried setting

   batch_strategy: sequence
   max_features = 10000

The data size is not that big, but I am still facing issues. Has anybody else faced the same issue? How can I force TensorFlow to use all available GPU devices?

The char n-gram CountVectorsFeaturizer blows up memory, hence the OOM. We are working on a sparse implementation for the features. You can try the new version on the updated-featurizers branch.
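To illustrate the point (a sketch with scikit-learn, not the Rasa featurizer itself): char_wb n-grams produce a very wide vocabulary, which is cheap to hold as a sparse matrix but explodes once densified to int64.

    from sklearn.feature_extraction.text import CountVectorizer
    import numpy as np

    texts = ["book a flight to berlin", "what is the weather today"]

    # char_wb 4-6 grams yield a far wider vocabulary than word tokens.
    vec = CountVectorizer(analyzer="char_wb", ngram_range=(4, 6))
    X = vec.fit_transform(texts)        # scipy.sparse matrix: only non-zeros are stored

    print(X.shape, X.nnz)               # wide matrix, few non-zero entries
    print(np.prod(X.shape) * 8)         # bytes needed if densified to int64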

@Ghostvv I checked with this branch. Training time is significantly reduced; it took 20 minutes on this data. When is the update with this feature coming?

I’m glad to hear that it is working for you. We’re currently working on finalizing it. Unfortunately, I cannot tell you the exact date when we will release it.

@Ghostvv I trained the model for 1000 epochs; the loss was > 5, accuracy was 0.998, and training took 3.5 hours. How do I reduce the loss?

If accuracy is that high, you don’t need to reduce the loss further.