Unable to download Hugging Face model

Hello,

I use Rasa v2.0.2

I get an error when I run rasa train nlu: Rasa can't download the Hugging Face model. This is the error:

OSError: Can't load tokenizer for 'camembert-base'. If you were trying to load it from 'Models - Hugging Face', make sure you don't have a local directory with the same name. Otherwise, make sure 'camembert-base' is the correct path to a directory containing all relevant files for a BertTokenizer tokenizer.
2022-09-13 16:08:09 WARNING urllib3.connectionpool - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /api/2801673/store/

However, when I run the transformers command directly, I have no problem downloading the model.

from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("camembert-base")
Downloading: 100%|██████████| 28.0/28.0 [00:00<00:00, 26.4kB/s]
Downloading: 100%|██████████| 570/570 [00:00<00:00, 437kB/s]
Downloading: 100%|██████████| 226k/226k [00:00<00:00, 664kB/s]
Downloading: 100%|██████████| 455k/455k [00:00<00:00, 1.09MB/s]

I have not found any specific proxy settings for Rasa in the docs. Has anyone experienced this problem?
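(For reference, transformers downloads models through the requests library, which honours the standard HTTP_PROXY / HTTPS_PROXY environment variables. A rough sketch of setting them before Rasa starts; the proxy address and credentials below are placeholders, not values from this thread:

import os

# Placeholder proxy URL with credentials (replace with your own).
# requests, which transformers uses for model downloads, reads these
# variables, so a rasa train nlu run started from the same environment
# can tunnel through an authenticating proxy.
proxy = "http://user:password@proxy.example.com:8080"
os.environ["HTTP_PROXY"] = proxy
os.environ["HTTPS_PROXY"] = proxy

The same effect can be had by exporting the two variables in the shell before running rasa train nlu.)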

My config:

language: fr

pipeline:

- name: HFTransformersNLP
  model_name: "bert"
  model_weights: "camembert-base"
  cache_dir: /xxx/yyyy/.cache # required with Botfront
- name: LanguageModelTokenizer
- name: LanguageModelFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier

Hi! Were you able to solve this problem?

Hello,

I managed to make this work, although I had the same problem originally. The config I am using now is:

pipeline:
- name: SpacyNLP
  model: "fr_core_news_lg" #python -m spacy download fr_core_news_lg 
- name: SpacyTokenizer
- name: CountVectorsFeaturizer
  analyzer: word
  OOV_token: oov
  strip_accents: ascii
- name: LanguageModelFeaturizer
  #model_name: "bert"
  #model_weights: "rasa/LaBSE"
  #cache_dir: "./LaBSE"
  model_name: "camembert"
  model_weights: "camembert-base"
  cache_dir: "./camembert"
- name: DIETClassifier
  intent_tokenization_flag: true
  intent_split_symbol: +
  epochs: 100
  constrain_similarities: true
- name: EntitySynonymMapper

As you can see, I tested "rasa/LaBSE" and afterwards came back to camembert. Then it worked. :man_shrugging:
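(If the proxy still blocks downloads when rasa train runs, another option is to pre-download the weights into the featurizer's cache_dir from a session where the download works. A rough sketch, assuming Rasa forwards cache_dir to transformers' from_pretrained; the paths mirror the config above:

from transformers import AutoTokenizer, TFAutoModel

# Download the tokenizer and the TensorFlow weights once into the
# directory that LanguageModelFeaturizer's cache_dir points to, so
# rasa train can load them from disk instead of reaching the Hub.
cache_dir = "./camembert"
AutoTokenizer.from_pretrained("camembert-base", cache_dir=cache_dir)
TFAutoModel.from_pretrained("camembert-base", cache_dir=cache_dir))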

Let me know. Cheers. Camille