Config for Spanish Bot

ZacanMeten · January 31, 2022, 4:47pm

I am creating my bot for a project, but I am having problems with the training result. Is this the best configuration I can use for Spanish?

pipeline:
  - name: "SpacyNLP"
    model: "es_core_news_sm"
    case_sensitive: False
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 200

policies:
  - name: RulePolicy
  - name: AugmentedMemoizationPolicy
    max_history: 6
  - name: TEDPolicy
    max_history: 10
    epochs: 20
    batch_size:
      - 32
      - 64
    constrain_similarities: true

rctatman · February 2, 2022, 8:05pm

Hmm, have you tried the default config for your pipeline? As written, it should work for most whitespace-separated languages (including Spanish), especially if you have a fair amount of training data.
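For reference, here is a rough sketch of what that default NLU pipeline looks like. It is modeled on the config that `rasa init` generated in Rasa 2.x/3.x releases, so treat the exact components and values as an assumption and compare them against the config your own Rasa version produces. The `language: es` key is also an addition for illustration; it does not appear in the config posted above.

```yaml
# Sketch of a default-style pipeline (assumed from Rasa 2.x/3.x; verify for your version).
language: es  # assumption: not present in the original post

pipeline:
  # Splits on whitespace, so no spaCy model is required
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  # Word-level bag-of-words features
  - name: CountVectorsFeaturizer
  # Character n-gram features, which help with typos and inflected forms
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
  # Predicts nlu_fallback when intent confidence is low or ambiguous
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1
```

The main difference from the posted config is that this version drops SpacyNLP, SpacyTokenizer, and SpacyFeaturizer, so the classifier learns only from features built on your own training data. Training both variants and comparing them with `rasa test nlu` is one way to see which works better for your bot.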

If you’re not working in a news domain (or with other fairly formal written text) and aren’t getting the results you want, you might also consider training a custom spaCy model.
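If you go that route, my understanding is that a custom spaCy pipeline that has been packaged and installed into the same environment can be referenced by name in SpacyNLP, the same way as the stock model. A minimal sketch, where `es_custom_chat` is a hypothetical package name and not a real published model:

```yaml
pipeline:
  - name: SpacyNLP
    # Hypothetical name of a custom spaCy package installed in this environment;
    # it replaces the stock "es_core_news_sm" model.
    model: "es_custom_chat"
    case_sensitive: false
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  # ... remaining featurizers and classifiers unchanged
```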

(Sorry for the kinda vague answer, but “what’s the best pipeline for a specific chatbot” will probably require a bit of guessing & testing.)
