When training my own model, I get a segmentation fault, shown below. To isolate the problem, I started a fresh EC2 instance on AWS with a new install of Rasa. The strange thing is that, right after installing, “rasa init --no-prompt” works: it generates the demo project in the folder “t-rasa” and trains successfully. However, running “rasa train --force -vv” in that same folder ends with Segmentation fault (core dumped):
(venv-rasa) ubuntu@ip-172-31-15-18:~/t-rasa$ rasa train --force -vv
2020-02-12 14:21:23 DEBUG rasa.nlu.training_data.loading - Training data format of 'data/nlu.md' is 'md'.
2020-02-12 14:21:23 DEBUG rasa.nlu.training_data.loading - Training data format of 'data/stories.md' is 'unk'.
2020-02-12 14:21:23 DEBUG pykwalify.compat - Using yaml library: /home/ubuntu/venv-rasa/lib/python3.6/site-packages/ruamel/yaml/__init__.py
2020-02-12 14:21:23 DEBUG rasa.nlu.training_data.loading - Training data format of 'data/nlu.md' is 'md'.
2020-02-12 14:21:23 DEBUG rasa.nlu.training_data.loading - Training data format of 'data/nlu.md' is 'md'.
Training Core model...
2020-02-12 14:21:26 INFO absl - Entry Point [tensor2tensor.envs.tic_tac_toe_env:TicTacToeEnv] registered with id [T2TEnv-TicTacToeEnv-v0]
2020-02-12 14:21:26 DEBUG rasa.core.nlg.generator - Instantiated NLG to 'TemplatedNaturalLanguageGenerator'.
2020-02-12 14:21:26 DEBUG rasa.core.training.generator - Generated trackers will be deduplicated based on their unique last 5 states.
2020-02-12 14:21:26 DEBUG rasa.core.training.generator - Number of augmentation rounds is 3
2020-02-12 14:21:26 DEBUG rasa.core.training.generator - Starting data generation round 0 ... (with 1 trackers)
Processed Story Blocks: 0%| | 0/5 [00:00<?, ?it/s, # trackers=1]Segmentation fault (core dumped)
If I run “rasa shell”, it fails as shown below:
(venv-rasa) ubuntu@ip-172-31-15-18:~/t-rasa$ rasa shell
2020-02-12 14:21:54 INFO root - Connecting to channel 'cmdline' which was specified by the '--connector' argument. Any other channels will be ignored. To connect to all given channels, omit the '--connector' argument.
2020-02-12 14:21:54 INFO root - Starting Rasa server on http://localhost:5005
2020-02-12 14:21:56 INFO absl - Entry Point [tensor2tensor.envs.tic_tac_toe_env:TicTacToeEnv] registered with id [T2TEnv-TicTacToeEnv-v0]
Bot loaded. Type a message and press enter (use '/stop' to exit):
Your input -> Segmentation fault (core dumped)
CPU, memory, and disk are all sufficient. If I run “rasa train nlu -vv”, I hit the same issue. Here is the output:
(venv-rasa) ubuntu@ip-172-31-15-18:~/t-rasa$ rasa train nlu -vv
2020-02-12 14:40:00 DEBUG rasa.nlu.training_data.loading - Training data format of 'data/nlu.md' is 'md'.
2020-02-12 14:40:00 DEBUG rasa.nlu.training_data.loading - Training data format of 'data/stories.md' is 'unk'.
2020-02-12 14:40:00 DEBUG rasa.nlu.training_data.loading - Training data format of 'data/nlu.md' is 'md'.
Training NLU model...
2020-02-12 14:40:03 INFO absl - Entry Point [tensor2tensor.envs.tic_tac_toe_env:TicTacToeEnv] registered with id [T2TEnv-TicTacToeEnv-v0]
2020-02-12 14:40:03 DEBUG rasa.nlu.training_data.loading - Training data format of 'data/nlu.md' is 'md'.
2020-02-12 14:40:03 INFO rasa.nlu.training_data.training_data - Training data stats:
- intent examples: 43 (7 distinct intents)
- Found intents: 'greet', 'affirm', 'goodbye', 'mood_great', 'deny', 'mood_unhappy', 'bot_challenge'
- Number of response examples: 0 (0 distinct response)
- entity examples: 0 (0 distinct entities)
- found entities:
2020-02-12 14:40:03 DEBUG rasa.nlu.training_data.training_data - Validating training data...
2020-02-12 14:40:03 INFO rasa.nlu.model - Starting to train component WhitespaceTokenizer
2020-02-12 14:40:03 INFO rasa.nlu.model - Finished training component.
2020-02-12 14:40:03 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2020-02-12 14:40:03 INFO rasa.nlu.model - Finished training component.
2020-02-12 14:40:03 INFO rasa.nlu.model - Starting to train component CRFEntityExtractor
2020-02-12 14:40:03 INFO rasa.nlu.model - Finished training component.
2020-02-12 14:40:03 INFO rasa.nlu.model - Starting to train component EntitySynonymMapper
2020-02-12 14:40:03 INFO rasa.nlu.model - Finished training component.
2020-02-12 14:40:03 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2020-02-12 14:40:03 DEBUG rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - No text provided for response attribute in any messages of training data. Skipping training a CountVectorizer for it.
2020-02-12 14:40:03 INFO rasa.nlu.model - Finished training component.
2020-02-12 14:40:03 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2020-02-12 14:40:03 DEBUG rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - No text provided for response attribute in any messages of training data. Skipping training a CountVectorizer for it.
2020-02-12 14:40:03 INFO rasa.nlu.model - Finished training component.
2020-02-12 14:40:03 INFO rasa.nlu.model - Starting to train component EmbeddingIntentClassifier
2020-02-12 14:40:03 DEBUG rasa.nlu.classifiers.embedding_intent_classifier - Started training embedding classifier.
Epochs: 100%|██████████| 300/300 [00:03<00:00, 94.43it/s, loss=1.092, acc=1.000]Segmentation fault (core dumped)
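To narrow down where the crash happens, one option is to rerun the failing command with Python's built-in faulthandler enabled, so that the interpreter prints the Python-level stack when the process receives SIGSEGV. The snippet below is only a minimal sketch: the helper name run_with_faulthandler is made up for illustration, and it assumes a POSIX system (such as this Ubuntu instance) where a process killed by a signal reports a negative return code. It does not fix anything; it only shows which Python call was active when the native code crashed.

```python
import os
import signal
import subprocess
import sys


def run_with_faulthandler(cmd):
    """Run `cmd` with PYTHONFAULTHANDLER=1 and report whether it died from SIGSEGV.

    Hypothetical helper for illustration. Returns (crashed, stderr_text).
    """
    # PYTHONFAULTHANDLER=1 makes any Python interpreter started by `cmd`
    # dump its traceback to stderr on fatal signals such as SIGSEGV.
    env = dict(os.environ, PYTHONFAULTHANDLER="1")
    proc = subprocess.run(
        cmd,
        env=env,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode stderr as text (works on Python 3.6)
    )
    # On POSIX, a negative return code -N means the child was killed by signal N.
    crashed = proc.returncode == -signal.SIGSEGV
    return crashed, proc.stderr
```

For the case above, the call would be run_with_faulthandler(["rasa", "train", "nlu", "-vv"]), assuming the rasa command from the active virtualenv is on PATH. If the first value returned is True, the second one should contain a “Fatal Python error: Segmentation fault” section followed by the stack of the thread that crashed.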
Hi @rasafan, I’m not using a Dockerfile; I’m running directly on an EC2 instance. The commands listed above are everything I used to solve the issue. I hope they work for your case too.