Rasa not picking special characters in an entity

shubham1140 · January 30, 2020, 1:21pm

Hi team,

Rasa NLU is unable to pick up special characters in an entity, such as ( alfred,rodger , europe/london ). Can you help me with this, @akelad?
Thanks

Tanja · February 3, 2020, 9:26am

Hi @shubham1140!

( alfred,rodger , europe/london )

What special characters are you referring to? Can you point me to them?

nadachaabani1 · February 3, 2020, 10:17am

I have the same problem with special characters, such as the . in .net and the + in C++:

- Knowledge in [C++](competency), [Python](competency), [Linux](competency) and [GIT](competency)
- i am good with [c#](competency) , [.net](competency) and [react](competency)

  f"Misaligned entity annotation for '{collected_text}' "
c:\users\mega\appdata\local\programs\python\python36\lib\site-packages\rasa\nlu\extractors\crf_entity_extractor.py:533: UserWarning: Misaligned entity annotation for 'C' in sentence 'Knowledge in C++, Python, Linux and GIT, administration tools' with intent 'inform'. Make sure the start and end values of the annotated training examples end at token boundaries (e.g. don't include trailing whitespaces or punctuation).

  f"Misaligned entity annotation for '{collected_text}' "
c:\users\mega\appdata\local\programs\python\python36\lib\site-packages\rasa\nlu\extractors\crf_entity_extractor.py:533: UserWarning: Misaligned entity annotation for 'net' in sentence 'i am good with c# , .net and react' with intent 'inform'. Make sure the start and end values of the annotated training examples end at token boundaries (e.g. don't include trailing whitespaces or punctuation).
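
For readers hitting the same warning, here is a minimal sketch of what "end at token boundaries" means. The tokenizer below is a simplified stand-in written for illustration, not Rasa's actual implementation: `toy_tokenize`, `is_aligned` and the `KEEP` character set are hypothetical names and assumptions. It splits on whitespace and trims edge characters outside `KEEP`, which is enough to reproduce the 'C' and 'net' tokens named in the warnings.

```python
import re

# Assumption: characters this toy tokenizer never trims from a token's edges.
KEEP = r"\w#@&"


def toy_tokenize(text):
    """Split on whitespace, then trim leading/trailing characters outside KEEP.

    Returns a list of (token, start, end) tuples with character offsets.
    """
    tokens = []
    for chunk in re.finditer(r"\S+", text):
        # Span from the first to the last KEEP character inside the chunk.
        inner = re.search(rf"[{KEEP}](?:.*[{KEEP}])?", chunk.group())
        if inner:
            start = chunk.start() + inner.start()
            end = chunk.start() + inner.end()
            tokens.append((text[start:end], start, end))
    return tokens


def is_aligned(text, start, end):
    """True if an entity span starts and ends exactly on token boundaries."""
    tokens = toy_tokenize(text)
    starts = {s for _, s, _ in tokens}
    ends = {e for _, _, e in tokens}
    return start in starts and end in ends


sentence = "i am good with c# , .net and react"
dot_net = sentence.index(".net")

print([tok for tok, _, _ in toy_tokenize(sentence)])
print(is_aligned(sentence, dot_net, dot_net + 4))      # False: span starts at '.'
print(is_aligned(sentence, dot_net + 1, dot_net + 4))  # True: span covers 'net'
```

With this kind of tokenization, an annotation like `[.net](competency)` starts one character before the token `net`, so the span cannot be mapped onto tokens and the extractor warns about it.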

ganeshv · February 3, 2020, 10:58am

Hello @nadachaabani1, @shubham, there’s a potential solution for you in this thread: Having trouble formatting training examples that contains a '-' or other punctuation signs.

tl;dr - there’s a regex that, in typical cases, does not treat special characters as delimiters within strings. So a case where I would like the string 75-100 to be extracted as two entities, 75 and 100, would fail. A solution would be to modify the regex to fit your specific needs.

shubham1140 · February 3, 2020, 12:57pm

The issue for me is that I need (/, ‘,’) to be treated as part of a single entity, not as separators. For example, city/state is a single entity for me, and the problem is that Rasa NLU is not picking it up because of the (/) in it. @ganeshv, @akelad

shubham1140 · February 3, 2020, 12:59pm

@Tanja Special characters like ( ‘,’ , ‘/’ ) in entity values are not picked up by Rasa NLU. Another example: if I write (doesn’t), it is also not picked up, even though the model was trained on it.

Tanja · February 3, 2020, 6:42pm

As already pointed out by @ganeshv, we have a regex in place that splits words on those characters into separate tokens. So if you are using the WhitespaceTokenizer this will happen. If you want to keep the words, you can first of all try a different tokenizer or update the regex by writing a custom tokenizer (you can use the WhitespaceTokenizer as an example and just update the regex over there).
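
As a rough illustration of the "update the regex" idea, here is a standalone sketch of tokenization logic that keeps symbols such as +, #, ., / and ' inside tokens. It is not a Rasa component and does not use Rasa's API: `keep_symbols_tokenize` and the `TRIM` set are hypothetical and only show the splitting rule you might port into a custom tokenizer.

```python
import re

# Assumption: only these characters are trimmed from the edges of a chunk.
# Anything else (+, #, ., /, ') stays part of the token.
TRIM = ",;:!?()\""


def keep_symbols_tokenize(text, trim=TRIM):
    """Split on whitespace and trim only the characters listed in `trim`.

    Returns a list of (token, start, end) tuples with character offsets.
    """
    tokens = []
    for chunk in re.finditer(r"\S+", text):
        word = chunk.group()
        stripped = word.strip(trim)
        if not stripped:
            continue  # the chunk was pure punctuation, e.g. a lone ","
        # Offset of the first character that survived the left-side trim.
        start = chunk.start() + len(word) - len(word.lstrip(trim))
        tokens.append((stripped, start, start + len(stripped)))
    return tokens


for sentence in (
    "Knowledge in C++, Python, Linux and GIT",
    "i am good with c# , .net and react",
    "my timezone is europe/london",
):
    print([tok for tok, _, _ in keep_symbols_tokenize(sentence)])
```

The trade-off in this sketch is that a sentence-final period is not in `TRIM`, so "react." would stay a single token; which characters to trim depends on your data.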

bayesianwannabe · March 20, 2020, 10:18pm

Hey shubham, how are you?

I am having the same issue… have you solved it? I wonder which tokenizer you are using… I am working with SpacyTokenizer, but even though the vanilla library keeps things like 05/02 together, it seems this component treats them differently.

samscudder · April 1, 2020, 1:48pm

We use SpacyTokenizer and I’m not having any problems with “-” or “.” in examples.

sibbsnb · May 12, 2020, 10:52pm

Yeah, spaCy solved my issue here with detecting a single-character symbol.
