I am using Rasa version 1.3.6. The training data size is around 5.6 MB, with 1412 intents. I am using tensorflow-gpu 1.14.0, and I also tried plain tensorflow 1.14.0 as an experiment.
My config.yml is as follows:
pipeline:
- name: WhitespaceTokenizer
- name: CRFEntityExtractor
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 4
  max_ngram: 6
- name: EmbeddingIntentClassifier
  epochs: 150
policies:
- name: MemoizationPolicy
  max_history: 1
- name: FallbackPolicy
  nlu_threshold: 0.8
  core_threshold: 0.3
- name: FormPolicy
- name: MappingPolicy
When I run training with this configuration, I get the following exception:
MemoryError: Unable to allocate array with shape (168287, 38324) and data type int64
[[{{node PyFunc}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[IteratorGetNext]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Shape/_7]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: MemoryError: Unable to allocate array with shape (168287, 38324) and data type int64
Traceback (most recent call last):
  File "/home/ubuntu/rasa_env/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 209, in __call__
    ret = func(*args)
  File "/home/ubuntu/rasa_env/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 514, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))
  File "/home/ubuntu/rasa_env/lib/python3.6/site-packages/rasa/utils/train_utils.py", line 201, in gen_batch
    session_data = balance_session_data(session_data, batch_size, shuffle)
  File "/home/ubuntu/rasa_env/lib/python3.6/site-packages/rasa/utils/train_utils.py", line 184, in balance_session_data
    Y=np.concatenate(new_Y),
  File "<__array_function__ internals>", line 6, in concatenate
MemoryError: Unable to allocate array with shape (168287, 38324) and data type int64
[[{{node PyFunc}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[IteratorGetNext]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored.
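As a rough sanity check (my own arithmetic, not from the Rasa docs), the array in the error message is simply too large to ever materialize densely:

# Dense int64 array of shape (168287, 38324), as reported in the MemoryError
rows, cols, itemsize = 168287, 38324, 8
print(rows * cols * itemsize / 1024 ** 3)  # ~48 GiB

Note that the failing np.concatenate happens in balance_session_data, i.e. this is ordinary host RAM being exhausted while batching, before anything reaches the GPU, which would also explain why adding GPUs does not help here.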
I tried a different analyzer:
- name: CountVectorsFeaturizer
  analyzer: word
In this case training completed successfully. Training time was around 2.5 hours on tensorflow and 1.25 hours on tensorflow-gpu, with a single GPU.
I also tried with more GPUs (4 GPUs, 12 GB of memory each), but the issue still occurs with the char_wb analyzer: TensorFlow uses only device:0, and no other GPU gets utilized.
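For reference, which devices TensorFlow can see at all can be checked with the standard TF 1.x device listing (nothing Rasa-specific):

import tensorflow as tf
from tensorflow.python.client import device_lib

# Prints every CPU/GPU device visible to this TensorFlow build (TF 1.14)
print(device_lib.list_local_devices())
print("GPU available:", tf.test.is_gpu_available())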
I even tried the following settings (batch_strategy on the EmbeddingIntentClassifier, max_features on the CountVectorsFeaturizer):

batch_strategy: sequence
max_features: 10000
The data size is not that big, but I am still facing this issue. Has anybody else faced the same problem? How can I force TensorFlow to use all available GPU devices?
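The only device control I am aware of in TF 1.x is masking GPUs via CUDA_VISIBLE_DEVICES before TensorFlow initializes; as far as I understand, this only selects which devices are visible and does not spread a single training run across them:

import os

# Must be set before TensorFlow is first imported/initialized.
# Makes GPUs 0-3 visible to TF; ops still land on device:0 unless the model
# code itself distributes the graph (e.g. via tf.distribute in TF 1.14).
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"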