Why is a local path being added to the generated intent_featurizer_count_vectors.pkl?

(Libindavis) #1

Hi Community,

I am using TensorFlow and CountVectorizer to featurize the input text.
Here is my pipeline configuration:

language: "en"
pipeline:
    - name: "nlp_spacy"
    - name: "tokenizer_spacy"
    - name: "intent_featurizer_count_vectors"
      stop_words: ['how','what','hows','is','the','whats']
      min_df: 0.0
      max_df: 1.0
      min_ngram: 1
      max_ngram: 2
    - name: "intent_entity_featurizer_regex"
    - name: "ner_crf"
      BILOU_flag: true
      features: [["low",'title'],
           ["bias", "low", "title","pos",'pattern','prefix5','prefix2', 'suffix5', 'suffix3'],
           ["low", "title"]]
    - name: "ner_duckling_http"
      url: "http://tegra-infra-nlp-vm-dev-01:8000"
      dimensions: ["time", "number", "duration", "ordinal"]
      locale: "en_US"
      timezone: "US/Pacific"
    - name: "ner_synonyms"
    - name: "intent_classifier_tensorflow_embedding"

But when I run my model in a Docker container, it tries to access my local file /home/<my_local_path>/rasa_nlu/rasa_nlu/featurizers/count_vectors_featurizer.py rather than the corresponding file inside the container, which is /app/rasa_nlu_chatbot/rasa_nlu/rasa_nlu/featurizers/count_vectors_featurizer.py. This leads to a strange error message like:

File "/usr/local/lib/python3.5/site-packages/sklearn/feature_extraction/text.py", line 266, in <lambda>
    tokenize(preprocess(self.decode(doc))), stop_words)
  File "/home/<my_local_path>/rasa_nlu/rasa_nlu/featurizers/count_vectors_featurizer.py", line 140, in _tokenizer

NameError: name 'T' is not defined

After some debugging, I found that in the generated intent_featurizer_count_vectors.pkl, there is a string of my local path /home/<my_local_path>/rasa_nlu/rasa_nlu/featurizers/count_vectors_featurizer.py.
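
For anyone who wants to repeat this check, here is a minimal sketch of how the raw bytes of a pickle file can be scanned for embedded .py paths. The helper name find_embedded_py_paths and the regular expression are my own assumptions for illustration, not part of rasa_nlu; the pattern only catches ASCII paths built from common characters.

```python
import re


def find_embedded_py_paths(data: bytes) -> list:
    """Return the distinct absolute *.py paths found in a byte string.

    Hypothetical helper: it does not unpickle anything, it only searches
    the raw bytes for ASCII text that looks like an absolute path.
    """
    pattern = re.compile(rb"/[\w./<>-]+\.py")
    seen = []
    for match in pattern.findall(data):
        path = match.decode("ascii", errors="replace")
        if path not in seen:
            seen.append(path)
    return seen


# Stand-in bytes that mimic the relevant part of the pickle dump.
sample = (
    b"\x8c\x02.0\x94\x8c\x99"
    b"/home/<my_local_path>/rasa_nlu/rasa_nlu/featurizers/count_vectors_featurizer.py"
    b"\x94\x8c\x04self\x94"
)
found = find_embedded_py_paths(sample)
```

To run it against a real model, read the file in binary mode (for example with open("intent_featurizer_count_vectors.pkl", "rb")) and pass the bytes to the helper.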

My question is: why is the model in the Docker container trying to access my local path? And why is a local path stored in the generated model at all?

My rasa_nlu version is 0.14.0a1.


(Libindavis) #2

Below is a snippet of the generated intent_featurizer_count_vectors.pkl. As you can see, my local path is referenced in it.

MethodType~T~E~TR~Th,~L^N_fill_function~T~S~T(h,~L^O_make_skel_func~T~S~Th.~L^HCodeType~T~E~TR~T(K^BK^@K^DK^DK^CCtt^@j^Ad^Ad^B|^A~C^C}^At^@j^B~H^@j^C~C^A}^B|^Bj^D|^A~C^A}

^C~H^@j^Erpt^F~H^@j^Gd^C~C^BrX~H^@j^E~H^@j^Gj^Hk^Frp~G^@f^Ad^Dd^E~D^H|^CD^@~C^A}^Cn^X~H^@j rp~G^@f^Ad^Fd^E~D^H|^CD^@~C^A}^C|^CS^@~T(~L%Override tokenizer in CountVecto rizer~T~L \b[0-9]+\b~T~L

__NUMBER__~T~L^Kvocabulary_~Th8(K^AK^@K^BK^DK^SC&g^@|^@]^^}^A|^A~H^@j^@j^Aj^B~C^@k^Fr^|^An^D~H^@j^C~Q^Bq^DS^@~T (h^^h=~L^Dkeys~Th^Xt~T~L^B.0~T~L^At~T~F~T~L~Y /home/<my_local_path>/rasa_nlu/rasa_nlu/featurizers/count_vectors_featurizer.py ~T~L

~TK~VC^B^F^A~T~L^Dself~T~E~T)t~TR~T~L5CountVectorsFeaturizer._tokenizer..

~Th8(K^AK^@K^BK^DK^SC g^@|^@]^X}^A|^A~H^@j^@k^Fr^X~H^@j^An^B|^A~Q^Bq^ DS^@~T)h^Yh^X~F~ThAhB~F~ThDhEK~\C^B^F^A~ThG~E~T)t~TR~Tt~T(~L^Bre~T~L^Csub~T~L^Gcompile~Th^G~L^Gfindall~Th^X~L^Ghasattr~Th^^h=h^Yt~T(hG~L^Dtext~Th^G~L^Ftokens~Tt~ThD~L _tokenizer~TK~JC^X^@^B^N^B^L^A
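
The CodeType, _make_skel_func and _fill_function markers in the dump suggest that the _tokenizer function was serialized by value, code object included. If that is the case, one possible explanation for the path is that every Python code object records the file it was compiled from in co_filename, and tracebacks print that recorded string without the file having to exist. The sketch below illustrates this with the standard library only; it uses marshal as a stand-in for the real pickler, and FAKE_PATH and the tokenizer function are made up for the example.

```python
import marshal
import traceback
import types

# Hypothetical path used only for illustration; no such file needs to exist.
FAKE_PATH = "/home/someone/project/featurizer.py"

SOURCE = """
def tokenizer(text):
    return undefined_name(text)  # raises NameError when called
"""

# Compile the source as if it lived at FAKE_PATH, then pull out the function.
namespace = {}
exec(compile(SOURCE, FAKE_PATH, "exec"), namespace)
tokenizer = namespace["tokenizer"]

# Serialize the code object by value and rebuild a function from the bytes,
# roughly what a by-value function pickler has to do.
payload = marshal.dumps(tokenizer.__code__)
restored_code = marshal.loads(payload)
restored = types.FunctionType(restored_code, {})

# The original path travels inside the serialized bytes...
path_in_payload = FAKE_PATH.encode() in payload
restored_filename = restored_code.co_filename

# ...and shows up in the traceback, although the file was never opened.
try:
    restored("hello")
except NameError:
    trace_text = traceback.format_exc()
```

Under this reading, the /home/... path in the traceback would be a label carried along with the serialized function rather than a file the container actually reads, and the NameError would be a separate problem with the names available to the restored function.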
