Rasa does not extract person names in Cyrillic

MMustafa · April 22, 2020, 1:07pm

Hello! I want to extract a person's name. For example, bot: What is your name? user: Meruyert (in Cyrillic). The problem is that Rasa extracts only the names I wrote in the intent examples; it does not recognize new names as names.

amn41 · April 24, 2020, 2:58pm

hi! sounds like your model has overfit to a few specific names. What config are you using? How much data do you have?

MMustafa · April 27, 2020, 4:56am

Hello! I am creating a bot in Russian, and I think I have a fair amount of data. I am using the default config file; I just changed the language from en to ru.

amn41 · April 27, 2020, 5:23am

in principle there is no reason this should not work. how much data? the default config has changed in different versions, can you post yours here?

MMustafa · April 27, 2020, 5:35am

This is the config file:

language: ru
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

# Configuration for Rasa Core.
# Policies
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
  - name: FallbackPolicy
    nlu_threshold: 0.3
    core_threshold: 0.3
    fallback_action_name: "action_fallback"

MMustafa · April 27, 2020, 5:37am

I don’t know how to measure data size, but overall I have 122 intents.

MMustafa · April 30, 2020, 6:41am

Also, how can I handle this problem: when I type “hello” in rasa shell nlu, it shows the greeting intent with a confidence of 0.45. That is fine. But if I write “Send me the latest news in sports”, NLU classifies it as the “what is your favourite sport” intent. So I set the NLU threshold in the Fallback policy to 0.6 to get rid of such misclassifications, but that also affects my greeting intent. How can I solve this? I want to increase the NLU threshold, but it also affects correctly classified intents: after increasing the threshold, the bot no longer recognizes “hello” as a greeting.

amn41 · May 1, 2020, 10:59am

hi @MMustafa ! I can recommend using the Rasa testing tools to pick the right cutoff (and perhaps add more data or tweak your configuration)

you can split your data into train and test sets, and use rasa test with the --histogram option, see Testing Your Assistant

that will show how the confidence values are distributed
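To make the trade-off behind the cutoff concrete, here is a minimal sketch in plain Python. It assumes you have already collected, for each test message, the confidence of the top intent and whether that intent was correct. The function name `sweep_thresholds` and the sample numbers are made up for illustration (only the 0.45 for “hello” comes from this thread); it is not part of Rasa.

```python
from typing import List, Tuple

# One entry per test message: (confidence of the top intent, was it correct?)
Prediction = Tuple[float, bool]


def sweep_thresholds(
    predictions: List[Prediction], thresholds: List[float]
) -> List[Tuple[float, float, float]]:
    """For each candidate nlu_threshold, return
    (threshold, share of correct predictions kept, share of wrong predictions rejected).

    A prediction is "kept" when its confidence is at or above the threshold;
    otherwise the fallback action would be triggered.
    """
    correct = [conf for conf, ok in predictions if ok]
    wrong = [conf for conf, ok in predictions if not ok]
    rows = []
    for t in thresholds:
        kept = sum(conf >= t for conf in correct)
        rejected = sum(conf < t for conf in wrong)
        rows.append((
            t,
            kept / len(correct) if correct else 0.0,
            rejected / len(wrong) if wrong else 0.0,
        ))
    return rows


if __name__ == "__main__":
    # Illustrative values only, not real test results.
    sample = [(0.45, True), (0.92, True), (0.81, True), (0.55, False), (0.30, False)]
    for t, kept, rejected in sweep_thresholds(sample, [0.3, 0.5, 0.6]):
        print(f"threshold={t:.2f}  correct kept={kept:.2f}  wrong rejected={rejected:.2f}")
```

If no threshold keeps most correct predictions while rejecting most wrong ones, the two confidence distributions overlap, and more training data or a changed pipeline is likely to help more than moving the cutoff.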

MMustafa · May 4, 2020, 4:27am

Hello Alan! Thanks for the response and advice. I used rasa test and wrote some test stories, after which I got this picture. Unfortunately, I don’t understand what this picture means. Can you explain which measure I should look at to determine the nlu/core threshold in the Fallback policy?

MMustafa · May 4, 2020, 6:17am

I have not solved the problem with extracting the person’s name. My form is Меруерт. I have 30 examples like this with different names. What should I do next? Can adding more training examples solve this problem? If yes, how many examples do I need; if not, what else can I do to solve it?

amn41 · May 4, 2020, 9:24am

hi @MMustafa - the rasa test command should produce a histogram, this blog post might help.

for testing your entity performance, I would recommend creating a train test split and evaluating
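One way to check whether the model generalizes to new names, rather than memorizing the ones it has seen, is to split the data so that the names in the test set never appear in the training set. The sketch below is plain Python, not a Rasa API; the function name `split_by_entity_value` and the example sentences are hypothetical, and it assumes each example has been reduced to a (text, name) pair.

```python
import random
from typing import List, Tuple

# One entry per training example: (message text, the name annotated in it)
Example = Tuple[str, str]


def split_by_entity_value(
    examples: List[Example], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Example], List[Example]]:
    """Split examples so that no name in the test set occurs in the train set."""
    names = sorted({name for _, name in examples})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(names)
    # Hold out at least one name, even for very small data sets.
    n_test = max(1, round(len(names) * test_fraction))
    test_names = set(names[:n_test])
    train = [ex for ex in examples if ex[1] not in test_names]
    test = [ex for ex in examples if ex[1] in test_names]
    return train, test


if __name__ == "__main__":
    # Hypothetical examples ("my name is ...", "I am ...", "call me ...").
    data = [
        ("меня зовут Меруерт", "Меруерт"),
        ("я Айгерим", "Айгерим"),
        ("зовите меня Асель", "Асель"),
        ("меня зовут Данияр", "Данияр"),
        ("я Ержан", "Ержан"),
    ]
    train, test = split_by_entity_value(data)
    print("train:", train)
    print("test:", test)
```

If entity scores on the held-out names are much lower than on names seen in training, that points to memorization; adding more varied names and sentence patterns is then the usual next step.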
