Error loading fastText word embeddings in rasa_nlu_examples

Hi Rasa community! I've trained my Word2Vec model with gensim and tried to use it with the fastText library in the rasa_nlu_examples project pipeline. My goal is to test a supervised word embedding model to improve NER. However, the load_model function in the fasttext library could not load my .bin model. I tested the model in plain Python code, and loading it there works without any problem.

@koaning, could you help me again, please?

My model.bin file for download: https://gofile.io/d/MPvMbX

Snippet of Python code:

from gensim.models import Word2Vec
embedding_model = Word2Vec.load('model.bin')

Rasa version: 1.10.8
Python: 3
Extra project installation: pip install git+https://github.com/RasaHQ/rasa-nlu-examples

pipeline config:

language: fa
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 5
  - name: rasa_nlu_examples.featurizers.dense.FastTextFeaturizer
    cache_dir: fasttext/
    file: model.bin
  - name: DIETClassifier
    epochs: 1
  - name: EntitySynonymMapper

rasa train nlu log:

Training NLU model...
Warning : `load_model` does not return WordVectorModel or SupervisedModel any more, but a `FastText` object which is very similar.
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/__main__.py", line 92, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/cli/train.py", line 140, in train_nlu
    persist_nlu_training_data=args.persist_nlu_data,
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/train.py", line 414, in train_nlu
    persist_nlu_training_data,
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/train.py", line 453, in _train_nlu_async
    persist_nlu_training_data=persist_nlu_training_data,
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/train.py", line 482, in _train_nlu_with_validated_data
    persist_nlu_training_data=persist_nlu_training_data,
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/nlu/train.py", line 75, in train
    trainer = Trainer(nlu_config, component_builder)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/nlu/model.py", line 145, in __init__
    self.pipeline = self._build_pipeline(cfg, component_builder)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/nlu/model.py", line 157, in _build_pipeline
    component = component_builder.create_component(component_cfg, cfg)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/nlu/components.py", line 781, in create_component
    component = registry.create_component_by_config(component_config, cfg)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/nlu/registry.py", line 246, in create_component_by_config
    return component_class.create(component_config, config)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa/nlu/components.py", line 489, in create
    return cls(component_config)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/rasa_nlu_examples/featurizers/dense/fasttext_featurizer.py", line 51, in __init__
    self.model = fasttext.load_model(path)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/fasttext/FastText.py", line 441, in load_model
    return _FastText(model_path=path)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/fasttext/FastText.py", line 98, in __init__
    self.f.loadModel(model_path)
ValueError: fasttext/model.bin has wrong file format!

I think gensim uses a slightly different file format than fastText. Could you open an issue on GitHub for this one? I'm the maintainer of that project, but I'm currently on holiday. If you add the issue to GitHub, I'll address it when I get back.
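One way to check the format mismatch suggested above is to look at the first bytes of the file. This is a minimal sketch, not part of rasa_nlu_examples. It assumes that native fastText binaries begin with the magic number 793712314 (`FASTTEXT_FILEFORMAT_MAGIC_INT32` in the fastText C++ source, stored as a little-endian int32 on common platforms) and that a model written by gensim's `Word2Vec.save` is a Python pickle. The helper name `sniff_model_format` is hypothetical.

```python
import struct

# Magic number that the fastText C++ code writes at the start of a model file
# (assumption: taken from FASTTEXT_FILEFORMAT_MAGIC_INT32 in the fastText source).
FASTTEXT_MAGIC = 793712314


def sniff_model_format(path):
    """Guess the on-disk format of a model file from its first four bytes.

    Hypothetical helper for illustration; returns "fasttext", "pickle" or "unknown".
    """
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        return "unknown"
    # Interpret the first four bytes as a little-endian signed 32-bit integer.
    (magic,) = struct.unpack("<i", header)
    if magic == FASTTEXT_MAGIC:
        return "fasttext"
    # Pickle streams written with protocol 2 or later start with the byte 0x80
    # followed by the protocol number.
    if header[0] == 0x80 and header[1] in (2, 3, 4, 5):
        return "pickle"
    return "unknown"
```

Calling `sniff_model_format("fasttext/model.bin")` on the file from the pipeline config should help narrow this down: a result other than "fasttext" would be consistent with the `wrong file format` error from `fasttext.load_model`. If that is the case, a model trained and saved by the fastText library itself (for example with `fasttext.train_unsupervised` followed by `save_model`) is probably the safer input for FastTextFeaturizer. Recent gensim releases also appear to offer `gensim.models.fasttext.save_facebook_model`, but that applies to gensim FastText models, not to Word2Vec models.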