Wrong intent and entity detected if examples contain numbers as digit sequences

solyarisoftware · August 13, 2021, 3:05pm

I have two different intents to capture body temperature and oxygen saturation values:

Intent: body_temperature_data (containing body_temperature custom entity)
Intent: oxygen_saturation_data (containing oxygen_saturation custom entity)

Here is an excerpt of data/intents.yml:

  - intent: body_temperature_data
    examples: |
      - sto bene
      - niente febbre
      - normale
      - sono senza febbre
      - non ho febbre
      - non mi sento la febbre
      - ho qualche linea
      - ho la febbre
      - credo di avere la febbre
      - mi sento un po di febbre
      - mi sento caldo
      - poca
      - molto poca
      - bassa
      - alta
      - molto alta
      - [35](body_temperature)
      - [35.5](body_temperature)
      - [35.6](body_temperature)
      - [35.7](body_temperature)
      - [35.8](body_temperature)
      - [35.9](body_temperature)
      - [35 e 9](body_temperature)
      - [35,9](body_temperature)
      - [36](body_temperature)
      - [36.0](body_temperature)
      - [36.1](body_temperature)
      - [36.2](body_temperature)
      - [36.3](body_temperature)
      - [36.4](body_temperature)
      - [36.5](body_temperature)
      - [36.6](body_temperature)
      - [36.7](body_temperature)
      - [36.8](body_temperature)
      - [36.9](body_temperature)
      - [36 e 7](body_temperature)
      - [36,9](body_temperature)
      - [37](body_temperature)
      - [37.0](body_temperature)
      - [37.1](body_temperature)
      - [37.2](body_temperature)
      - [37.3](body_temperature)
      - [37.4](body_temperature)
      - [37.5](body_temperature)
      - [37.6](body_temperature)
      - [37.7](body_temperature)
      - [37.8](body_temperature)
      - [37.9](body_temperature)
      - [37,2](body_temperature)
      - [37.5](body_temperature)
      - [37 , 6](body_temperature)
      - [37 . 6](body_temperature)
      - [38](body_temperature)
      - [38.0](body_temperature)
      - [38.1](body_temperature)
      - [38.2](body_temperature)
      - [38.3](body_temperature)
      - [38.4](body_temperature)
      - [38.5](body_temperature)
      - [38.6](body_temperature)
      - [38.7](body_temperature)
      - [38.8](body_temperature)
      - [38.9](body_temperature)
      - [38 , 1](body_temperature)
      - [38 . 2](body_temperature)
      - [39](body_temperature)
      - [39,1](body_temperature)
      - [39.1](body_temperature)
      - [39.2](body_temperature)
      - [39.3](body_temperature)
      - [39.5](body_temperature)
      - [39.6](body_temperature)
      - [39.7](body_temperature)
      - [39.8](body_temperature)
      - [39.9](body_temperature)
      - [40](body_temperature)
      - [41](body_temperature)
      - [trentacinque](body_temperature)
      - [trentasei](body_temperature)
      - [trentasei e otto](body_temperature)
      - [trentasette](body_temperature)
      - [trentasette emmezzo](body_temperature)
      - [trentasette e mezzo](body_temperature)
      - [trentasette punto otto](body_temperature)
      - [trentasette e quattro lineette](body_temperature)
      - [trentasette e 6 linee](body_temperature)
      - [trentasette virgola sei](body_temperature)
      - [trentasette punto sette](body_temperature)
      - [trentasette punto otto](body_temperature)
      - [trentotto](body_temperature)
      - [trentotto punto uno](body_temperature)
      - [trentotto  e 2 linee](body_temperature)
      - [trentotto e due](body_temperature)
      - [trentotto punto tre](body_temperature)
      - [trentotto e quattro](body_temperature)
      - [trentotto virgola quattro](body_temperature)
      - [trentotto emmezzo](body_temperature)
      - [trentanove](body_temperature)
      - [trentanove e due](body_temperature)
      - [trentanove emmezzo](body_temperature)
      - [quaranta](body_temperature)
      - [quarantuno](body_temperature)

  - intent: oxygen_saturation_data
    examples: |
      - [70](oxygen_saturation)
      - [71](oxygen_saturation)
      - [72](oxygen_saturation)
      - [73](oxygen_saturation)
      - [74](oxygen_saturation)
      - [75](oxygen_saturation)
      - [76](oxygen_saturation)
      - [77 e 9](oxygen_saturation)
      - [78,7](oxygen_saturation)
      - [79](oxygen_saturation)
      - [80](oxygen_saturation)
      - [80.5](oxygen_saturation)
      - [81](oxygen_saturation)
      - [81.6](oxygen_saturation)
      - [82](oxygen_saturation)
      - [82.4](oxygen_saturation)
      - [83](oxygen_saturation)
      - [83.7](oxygen_saturation)
      - [84](oxygen_saturation)
      - [84.1](oxygen_saturation)
      - [85](oxygen_saturation)
      - [85.2](oxygen_saturation)
      - [86](oxygen_saturation)
      - [86.9](oxygen_saturation)
      - [87](oxygen_saturation)
      - [87.8](oxygen_saturation)
      - [88](oxygen_saturation)
      - [88.0](oxygen_saturation)
      - [88.1](oxygen_saturation)
      - [89.0](oxygen_saturation)
      - [89](oxygen_saturation)
      - [89.7](oxygen_saturation)
      - [90](oxygen_saturation)
      - [90.0](oxygen_saturation)
      - [90.1](oxygen_saturation)
      - [90.2](oxygen_saturation)
      - [90.3](oxygen_saturation)
      - [90.4](oxygen_saturation)
      - [90.5](oxygen_saturation)
      - [90.6](oxygen_saturation)
      - [90.7](oxygen_saturation)
      - [90.8](oxygen_saturation)
      - [90.9](oxygen_saturation)
      - [91](oxygen_saturation)
      - [91.6](oxygen_saturation)
      - [92](oxygen_saturation)
      - [92.9](oxygen_saturation)
      - [93](oxygen_saturation)
      - [93.8](oxygen_saturation)
      - [94](oxygen_saturation)
      - [94.5](oxygen_saturation)
      - [95](oxygen_saturation)
      - [95.4](oxygen_saturation)
      - [96](oxygen_saturation)
      - [96.7](oxygen_saturation)
      - [97](oxygen_saturation)
      - [97.5](oxygen_saturation)
      - [98](oxygen_saturation)
      - [98.4](oxygen_saturation)
      - [99](oxygen_saturation)
      - [99 e 1](oxygen_saturation)
      - [99.0](oxygen_saturation)
      - [99.9](oxygen_saturation)
      - [100](oxygen_saturation)
      - [settanta](oxygen_saturation)
      - [settantuno](oxygen_saturation)
      - [settantadue](oxygen_saturation)
      - [settantatre](oxygen_saturation)
      - [settantaquattro](oxygen_saturation)
      - [settantacinque](oxygen_saturation)
      - [settantasei](oxygen_saturation)
      - [settantasette](oxygen_saturation)
      - [settantotto](oxygen_saturation)
      - [settantanove](oxygen_saturation)
      - [ottanta](oxygen_saturation)
      - [ottantuno](oxygen_saturation)
      - [ottantadue](oxygen_saturation)
      - [ottantatre](oxygen_saturation)
      - [ottantatre punto cinque](oxygen_saturation)
      - [ottantatre punto sei](oxygen_saturation)
      - [ottantaquattro emmezzo](oxygen_saturation)
      - [ottantaquattro e sei](oxygen_saturation)
      - [ottantaquattro punto sette](oxygen_saturation)
      - [ottantaquattro punto sei](oxygen_saturation)
      - [ottantaquattro punto nove](oxygen_saturation)
      - [ottantacinque](oxygen_saturation)
      - [ottantacinque punto cinque](oxygen_saturation)
      - [ottantacinque e quattro](oxygen_saturation)
      - [ottantasei](oxygen_saturation)
      - [ottantasette](oxygen_saturation)
      - [ottantotto](oxygen_saturation)
      - [ottantanove](oxygen_saturation)
      - [novanta](oxygen_saturation)
      - [novanta punto tre](oxygen_saturation)
      - [novanta punto otto](oxygen_saturation)
      - [novantuno](oxygen_saturation)
      - [novantuno punto cinque](oxygen_saturation)
      - [novantuno punto nove](oxygen_saturation)
      - [novantadue](oxygen_saturation)
      - [novantatre](oxygen_saturation)
      - [novantatre e sei](oxygen_saturation)
      - [novantatre punto sette](oxygen_saturation)
      - [novantatre virgola otto](oxygen_saturation)
      - [novantatre e due](oxygen_saturation)
      - [novantatre punto uno](oxygen_saturation)
      - [novantatre virgola nove](oxygen_saturation)
      - [novantaquattro](oxygen_saturation)
      - [novantaquattro virgola due](oxygen_saturation)
      - [novantaquattro virgola otto](oxygen_saturation)
      - [novantacinque](oxygen_saturation)
      - [novantacinque e cinque](oxygen_saturation)
      - [novantacinque punto cinque](oxygen_saturation)
      - [novantasei](oxygen_saturation)
      - [novantasei e uno](oxygen_saturation)
      - [novantasei e cinque](oxygen_saturation)
      - [novantasette](oxygen_saturation)
      - [novantasette e due](oxygen_saturation)
      - [novantasette e sei](oxygen_saturation)
      - [novantotto](oxygen_saturation)
      - [novantotto e cinque](oxygen_saturation)
      - [novantanove](oxygen_saturation)
      - [novantanove emmezzo](oxygen_saturation)
      - [cento](oxygen_saturation)

As the examples show, I would like to extract entity values (and afterwards fill form slots) expressed in either of two ways:

numbers as digits (35.5), typically typed by the user on a chat messaging channel
numbers as words (trentacinque punto cinque), typically entered via speech, since the speech recognition engine generally returns a literal transcript for numbers.
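For the digit form, the training examples use several spellings of the same value (35.9, 35,9, 35 e 9, 37 . 6). A minimal sketch of how such strings could be normalised to a float once the entity has been extracted is shown here. The helper name `parse_digit_reading` is hypothetical and not part of Rasa; it only assumes the separators that appear in the examples above.

```python
import re
from typing import Optional

# Hypothetical helper (not part of Rasa): integer part, optional separator
# ('.', ',' or the Italian conjunction 'e'), optional fractional digits.
_READING = re.compile(r"^\s*(\d{1,3})(?:\s*(?:[.,]|\be\b)\s*(\d+))?\s*$")


def parse_digit_reading(text: str) -> Optional[float]:
    """Return the reading as a float, or None if the text is not digit-based."""
    match = _READING.match(text)
    if match is None:
        return None
    whole = match.group(1)
    fraction = match.group(2) or "0"
    # Build the decimal string first so that float() does the only rounding.
    return float(f"{whole}.{fraction}")


for sample in ["35.9", "35,9", "35 e 9", "37 . 6", "36", "trentasei"]:
    print(repr(sample), "->", parse_digit_reading(sample))
```

Readings written as words return None here on purpose; they need a separate conversion step.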

Here is what happens when I test the RASA NLU:

$ rasa shell nlu --quiet
NLU model loaded. Type a message and press enter to parse it.
Next message:
90.3
{
  "text": "90.3",
  "intent": {
    "id": -6401318193538980427,
    "name": "body_temperature_data",
    "confidence": 0.7580949664115906
  },
  "entities": [
    {
      "entity": "body_temperature",
      "start": 0,
      "end": 4,
      "confidence_entity": 0.7051703929901123,
      "value": "90.3",
      "extractor": "DIETClassifier"
    }
  ],
  "intent_ranking": [
    {
      "id": -6401318193538980427,
      "name": "body_temperature_data",
      "confidence": 0.7580949664115906
    },
    {
      "id": 8358940020600517004,
      "name": "oxygen_saturation_data",
      "confidence": 0.18363729119300842
    },
    {
      "id": -860430617479998517,
      "name": "mood_unhappy",
      "confidence": 0.010874141938984394
    },
Next message:
novanta punto tre
{
  "text": "novanta punto tre",
  "intent": {
    "id": 8358940020600517004,
    "name": "oxygen_saturation_data",
    "confidence": 0.9999997615814209
  },
  "entities": [
    {
      "entity": "oxygen_saturation",
      "start": 0,
      "end": 17,
      "confidence_entity": 0.9956320524215698,
      "value": "novanta punto tre",
      "extractor": "DIETClassifier"
    }
  ],
  "intent_ranking": [
    {
      "id": 8358940020600517004,
      "name": "oxygen_saturation_data",
      "confidence": 0.9999997615814209
    },
    {
      "id": -860430617479998517,
      "name": "mood_unhappy",
      "confidence": 4.465892544658345e-08
    },

So when numbers are entered as words, RASA correctly classifies the intent oxygen_saturation_data and the entity oxygen_saturation. So far, so good.

But if I enter numbers as digits (e.g. 90.3), both the intent and the entity are wrongly classified.

This surprises me because the example sets of the two intents body_temperature_data and oxygen_saturation_data are two completely separate sets of texts!

My question is: WHY are the intent and entity wrongly classified?

BTW, I tried adding quotation marks to the examples:

  - ['35.5'](oxygen_saturation)

Instead of:

  - [35.5](oxygen_saturation)

but this raises the following warning at training time:

/home/giorgio/.local/lib/python3.8/site-packages/rasa/shared/utils/io.py:97: UserWarning: Misaligned entity annotation in message ‘‘35.5’’ with intent ‘body_temperature_data’. Make sure the start and end values of entities ([(0, 6, “‘35.5’”)]) in the training data match the token boundaries ([(0, 5, “'35.5”)]). Common causes:

entities include trailing whitespaces or punctuation

the tokenizer gives an unexpected result, due to languages such as Chinese that don’t use whitespace for word separation More info at Training Data Format
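The warning compares the annotated entity span with the token boundaries. A simplified sketch of that comparison follows, using the offsets printed in the warning itself. The function `is_aligned` is a hypothetical stand-in, not Rasa's own code, and the token span for the unquoted example is an assumption.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets


def is_aligned(entity: Span, tokens: List[Span]) -> bool:
    """True if the entity starts at a token start and ends at a token end."""
    starts = {start for start, _ in tokens}
    ends = {end for _, end in tokens}
    return entity[0] in starts and entity[1] in ends


# Offsets copied from the warning: the annotation covers 6 characters
# (both quotes included), but the only token reported covers 5.
print(is_aligned((0, 6), [(0, 5)]))  # False

# Unquoted example "35.5": assumed here to be a single 4-character token.
print(is_aligned((0, 4), [(0, 4)]))  # True
```

In other words, the closing quote falls outside the token, so the annotation with quotes cannot line up with what the tokenizer produced.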

My doubt is whether numbers (e.g. floating-point numbers written as digit strings such as 35.5) can be used as entities (and as intent examples). Could this be the reason why RASA NLU fails (see the rasa shell nlu report above)?

Any idea?

$ cat config.yml

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: it

pipeline:

  # pip3 install rasa[spacy]
  # python3 -m spacy download it_core_news_sm
  # python3 -m spacy download it_core_news_lg
  # https://rasa.com/docs/rasa/components#spacynlp
  - name: "SpacyNLP"
    # language model to load
    # italian large model: it_core_news_lg
    # italian small model: it_core_news_sm
    model: "it_core_news_sm"
    # when retrieving word vectors, this will decide if the casing
    # of the word is relevant. E.g. `hello` and `Hello` will
    # retrieve the same vector, if set to `False`. For some
    # applications and models it makes sense to differentiate
    # between these two words, therefore setting this to `True`.
    case_sensitive: false

  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4

  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true

  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true

  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1

policies:

Thanks
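As a side note on the FallbackClassifier settings in the config above: a simplified sketch of the two documented conditions (top confidence below `threshold`, or the two best intents closer than `ambiguity_threshold`) suggests the wrong prediction for 90.3 would not have been caught as a fallback. The function `would_fall_back` is hypothetical and only mirrors those two checks; the confidences are copied from the first shell output.

```python
from typing import List, Tuple


def would_fall_back(
    ranking: List[Tuple[str, float]],
    threshold: float = 0.3,
    ambiguity_threshold: float = 0.1,
) -> bool:
    """Simplified sketch of the two fallback conditions (not Rasa's code)."""
    ordered = sorted(ranking, key=lambda item: item[1], reverse=True)
    top = ordered[0][1]
    if top < threshold:
        return True  # the best intent is not confident enough
    if len(ordered) > 1 and top - ordered[1][1] < ambiguity_threshold:
        return True  # the two best intents are too close to call
    return False


ranking = [
    ("body_temperature_data", 0.7580949664115906),
    ("oxygen_saturation_data", 0.18363729119300842),
    ("mood_unhappy", 0.010874141938984394),
]
print(would_fall_back(ranking))  # False
```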

nik202 · August 13, 2021, 4:24pm

@solyarisoftware Hi, nice question! What is the slot type of oxygen_saturation? I guess text? Just checking.

solyarisoftware · August 13, 2021, 6:46pm

Yes @nik202, thanks

entities:
  - body_temperature
  - oxygen_saturation
 
slots:
  body_temperature:
    type: text
    auto_fill: false
  oxygen_saturation:
    type: text
    auto_fill: false
 
forms:
  health_monitoring_form:
    required_slots:
      body_temperature:
        - type: from_text
          intent_name: body_temperature_data
      oxygen_saturation:
        - type: from_text
          intent_name: oxygen_saturation_data

BTW, as you can see, in the form the slots are mapped from intents. That solves the issue, but I guess it’s a workaround: the default choice would be to fill slots from the corresponding entities, but that way I run into problems, as you can imagine.

Thanks for the help

nik202 · August 13, 2021, 7:01pm

@solyarisoftware great, you solved the query.

solyarisoftware · August 13, 2021, 9:27pm

But the initial problem (intent/entity confusion) is still there.

nik202 · August 13, 2021, 9:30pm

@solyarisoftware slot type float? Have you tried that? Or type any?

solyarisoftware · August 13, 2021, 9:46pm

I’d exclude type float because I want to accept values in both numeric and literal form, e.g.

93.4
ninety three point four

But tomorrow I’ll try type any. Thanks.

So setting type text doesn’t allow RASA to classify numbers correctly? Weird.
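Keeping the slot as text means the conversion from a literal form to a number has to happen somewhere else. A minimal sketch for the Italian word forms used in the training data (for example novanta punto tre, trentotto emmezzo) is given here. All names are hypothetical, the vocabulary covers only 0-9, 30-99 and 100, and accented spellings are not handled.

```python
from typing import Dict, Optional

UNITS = ["zero", "uno", "due", "tre", "quattro",
         "cinque", "sei", "sette", "otto", "nove"]
TENS = {30: "trenta", 40: "quaranta", 50: "cinquanta", 60: "sessanta",
        70: "settanta", 80: "ottanta", 90: "novanta"}
SEPARATORS = {"punto", "virgola", "e"}


def build_vocabulary() -> Dict[str, int]:
    """Map Italian number words for 0-9, 30-99 and 100 to integers."""
    words = {word: value for value, word in enumerate(UNITS)}
    for base, tens_word in TENS.items():
        words[tens_word] = base
        for unit in range(1, 10):
            # The tens drop their final vowel before "uno" and "otto".
            stem = tens_word[:-1] if unit in (1, 8) else tens_word
            words[stem + UNITS[unit]] = base + unit
    words["cento"] = 100
    return words


VOCABULARY = build_vocabulary()


def parse_spoken_reading(text: str) -> Optional[float]:
    """Convert e.g. 'novanta punto tre' to 90.3; None if not understood."""
    tokens = text.lower().split()
    if not tokens or tokens[0] not in VOCABULARY:
        return None
    whole = VOCABULARY[tokens[0]]
    rest = tokens[1:]
    if not rest:
        return float(whole)
    # "emmezzo" and "e mezzo" both mean "and a half".
    if rest in (["emmezzo"], ["e", "mezzo"]):
        return float(f"{whole}.5")
    if len(rest) == 2 and rest[0] in SEPARATORS and rest[1] in VOCABULARY:
        fraction = VOCABULARY[rest[1]]
        if fraction < 10:
            return float(f"{whole}.{fraction}")
    return None


for sample in ["novanta punto tre", "trentotto emmezzo", "novnt virgola tri"]:
    print(repr(sample), "->", parse_spoken_reading(sample))
```

A misspelled transcript simply yields None, which leaves room to ask the user again instead of storing a wrong value.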

nik202 · August 13, 2021, 9:46pm

@solyarisoftware good luck!

solyarisoftware · August 14, 2021, 10:09am

BTW, I think the wrong intent/entity prediction by RASA NLU is NOT related to the slot configuration. The `rasa shell nlu` report seems to suggest that the issue is purely an NLU problem with numbers.
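To get a feeling for what the NLU sees in a digit string, the sketch below approximates the character n-grams (1 to 4, within word boundaries) configured for the second CountVectorsFeaturizer in the pipeline. It is a plain-Python approximation written for illustration, not the featurizer's actual code, and whether shared n-grams explain the misclassification is not established here.

```python
from typing import Set


def char_wb_ngrams(text: str, min_n: int = 1, max_n: int = 4) -> Set[str]:
    """Character n-grams taken inside each whitespace-padded word."""
    grams: Set[str] = set()
    for word in text.lower().split():
        padded = f" {word} "
        for n in range(min_n, max_n + 1):
            for i in range(len(padded) - n + 1):
                grams.add(padded[i:i + n])
    return grams


saturation = char_wb_ngrams("90.3")   # an oxygen_saturation example
temperature = char_wb_ngrams("39.3")  # a body_temperature example

print(sorted(saturation))
print("shared with 39.3:", sorted(saturation & temperature))
```

Digit readings from both intents are built from the same handful of characters, so some of these n-grams are necessarily shared between the two example sets.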

solyarisoftware · August 15, 2021, 4:24pm

I may have solved the issue. The results are now absolutely satisfying (see the rasa shell nlu tests at the end).

What solved it was:

defining the custom entities body_temperature and oxygen_saturation as lookup tables (see the file `data/entities.yml`)
updating the intents body_temperature_data and oxygen_saturation_data to contain more examples and a greater variety of entities.

So I understand that training ENTITIES doesn’t work if I put the entity examples only inside the INTENTS examples. At least, not there alone…

What is also not clear to me is this: according to the RASA docs, entity lookup tables are handled as lists of regular expressions… so I don’t fully understand how the RASA DIETClassifier is nevertheless able to correctly classify badly written entities (misspelled, etc.).
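One way to read the docs on this point: a lookup table is turned into a case-insensitive regular expression, and the RegexFeaturizer only reports whether that pattern was found, as one extra feature next to the others. The sketch below illustrates the idea with hypothetical helper names. It is a simplification, not Rasa's implementation, and uses just three entries from the oxygen_saturation table.

```python
import re
from typing import List


def lookup_to_pattern(entries: List[str]) -> re.Pattern:
    """Build one case-insensitive alternation from the lookup entries."""
    # Longest entries first so "90.3" is tried before "90".
    ordered = sorted(entries, key=len, reverse=True)
    alternation = "|".join(re.escape(entry) for entry in ordered)
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


SATURATION = lookup_to_pattern(["90", "90.3", "novanta punto tre"])


def lookup_feature(text: str) -> int:
    """1 if any lookup entry occurs in the text, else 0."""
    return int(SATURATION.search(text) is not None)


print(lookup_feature("90.3"))               # 1
print(lookup_feature("novnt virgola tri"))  # 0
```

Under this reading, a misspelled value just gets a 0 for the lookup feature; the classifier still has the token and character features to work with, which would be consistent with DIETClassifier tagging badly written input.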

$ cat domain.yml (excerpt)

version: "2.0"

intents:

  # patient_monitoring application specific intents
  - health_monitoring_request

  - body_temperature_data:
      use_entities:
        - body_temperature

  - oxygen_saturation_data:
      use_entities:
        - oxygen_saturation

$ cat data/entities.yml (excerpt)

version: "2.0"

nlu:

  - lookup: body_temperature
    examples: |
      - 35
      - 35.0
      - 35.5
      - 35.5
      - 35.6
      - 35.7
      - 35.8
      - 35.9
      - 36
      - 36.0
      - 36.1
      - 36.2
      - 36.3
      - 36.4
      - 36.5
      - 36.6
      - 36.7
      - 36.8
      - 36.9
      - 37
      - 37.0
      - 37.1
      - 37.2
      - 37.3
      - 37.4
      - 37.5
      - 37.6
      - 37.7
      - 37.8
      - 37.9
      - 38
      - 38.0
      - 38.1
      - 38.2
      - 38.3
      - 38.4
      - 38.5
      - 38.6
      - 38.7
      - 38.8
      - 38.9
      - 39
      - 39.1
      - 39.2
      - 39.3
      - 39.5
      - 39.6
      - 39.7
      - 39.8
      - 39.9
      - 40
      - 41
      - trentacinque
      - trentasei
      - trentasei punto uno
      - trentasei punto due
      - trentasei punto tre
      - trentasei punto quattro
      - trentasei punto cinque
      - trentasei punto sei
      - trentasei punto sette
      - trentasei punto otto
      - trentasei punto nove
      - trentasette
      - trentasette punto uno
      - trentasette punto due
      - trentasette punto tre
      - trentasette punto quattro
      - trentasette punto cinque
      - trentasette punto sei
      - trentasette punto sette
      - trentasette punto otto
      - trentasette punto nove
      - trentasette punto sette
      - trentasette punto otto
      - trentasette punto nove
      - trentotto
      - trentotto punto uno
      - trentotto punto due
      - trentotto punto tre
      - trentotto punto quattro
      - trentotto punto cinque
      - trentotto punto sei
      - trentotto punto sette
      - trentotto punto otto
      - trentotto punto nove
      - trentotto punto sette
      - trentotto punto otto
      - trentotto punto nove
      - trentotto punto uno
      - trentotto punto due
      - trentotto punto tre
      - trentotto punto quattro
      - trentotto punto cinque
      - trentotto punto sei
      - trentotto punto sette
      - trentotto punto otto
      - trentotto punto nove
      - trentotto punto sette
      - trentotto punto otto
      - trentotto punto nove
      - trentanove
      - trentanove punto uno
      - trentanove punto due
      - trentanove punto tre
      - trentanove punto quattro
      - trentanove punto cinque
      - trentanove punto sei
      - trentanove punto sette
      - trentanove punto otto
      - trentanove punto nove
      - trentanove punto sette
      - trentanove punto otto
      - trentanove punto nove
      - quaranta
      - quarantuno

  - lookup: oxygen_saturation
    examples: |
      - 70
      - 71
      - 72
      - 73
      - 74
      - 75
      - 76
      - 77
      - 78
      - 79
      - 80
      - 80.1
      - 80.2
      - 80.3
      - 80.4
      - 80.5
      - 80.6
      - 80.7
      - 80.8
      - 80.9
      - 81
      - 81.1
      - 81.2
      - 81.3
      - 81.4
      - 81.5
      - 81.6
      - 81.7
      - 81.8
      - 81.9
      - 82
      - 82.1
      - 82.2
      - 82.3
      - 82.4
      - 82.5
      - 82.6
      - 82.7
      - 82.8
      - 82.9
      - 83
      - 83.1
      - 83.2
      - 83.3
      - 83.4
      - 83.5
      - 83.6
      - 83.7
      - 83.8
      - 83.9
      - 84
      - 84.1
      - 84.2
      - 84.3
      - 84.4
      - 84.5
      - 84.6
      - 84.7
      - 84.8
      - 84.9
      - 85
      - 85.1
      - 85.2
      - 85.3
      - 85.4
      - 85.5
      - 85.6
      - 85.7
      - 85.8
      - 85.9
      - 86
      - 86.1
      - 86.2
      - 86.3
      - 86.4
      - 86.5
      - 86.6
      - 86.7
      - 86.8
      - 86.9
      - 87
      - 87.1
      - 87.2
      - 87.3
      - 87.4
      - 87.5
      - 87.6
      - 87.7
      - 87.8
      - 87.9
      - 88
      - 88.1
      - 88.2
      - 88.3
      - 88.4
      - 88.5
      - 88.6
      - 88.7
      - 88.8
      - 88.9
      - 89
      - 89.1
      - 89.2
      - 89.3
      - 89.4
      - 89.5
      - 89.6
      - 89.7
      - 89.8
      - 89.9
      - 90
      - 90.1
      - 90.2
      - 90.3
      - 90.4
      - 90.5
      - 90.6
      - 90.7
      - 90.8
      - 90.9
      - 91
      - 91.1
      - 91.2
      - 91.3
      - 91.4
      - 91.5
      - 91.6
      - 91.7
      - 91.8
      - 91.9
      - 92
      - 92.1
      - 92.2
      - 92.3
      - 92.4
      - 92.5
      - 92.6
      - 92.7
      - 92.8
      - 92.9
      - 93
      - 93.1
      - 93.2
      - 93.3
      - 93.4
      - 93.5
      - 93.6
      - 93.7
      - 93.8
      - 93.9
      - 94
      - 94.1
      - 94.2
      - 94.3
      - 94.4
      - 94.5
      - 94.6
      - 94.7
      - 94.8
      - 94.9
      - 95
      - 95.1
      - 95.2
      - 95.3
      - 95.4
      - 95.5
      - 95.6
      - 95.7
      - 95.8
      - 95.9
      - 96
      - 96.1
      - 96.2
      - 96.3
      - 96.4
      - 96.5
      - 96.6
      - 96.7
      - 96.8
      - 96.9
      - 97
      - 97.1
      - 97.2
      - 97.3
      - 97.4
      - 97.5
      - 97.6
      - 97.7
      - 97.8
      - 97.9
      - 98
      - 98.1
      - 98.2
      - 98.3
      - 98.4
      - 98.5
      - 98.6
      - 98.7
      - 98.8
      - 98.9
      - 99
      - 99.1
      - 99.2
      - 99.3
      - 99.4
      - 99.5
      - 99.6
      - 99.7
      - 99.8
      - 99.9
      - 100
      - settanta
      - settantuno
      - settantadue
      - settantatre
      - settantaquattro
      - settantacinque
      - settantasei
      - settantasette
      - settantotto
      - settantanove
      - ottanta
      - ottantuno
      - ottantadue
      - ottantatre
      - ottantatre punto uno
      - ottantatre punto due
      - ottantatre punto tre
      - ottantatre punto quattro
      - ottantatre punto cinque
      - ottantatre punto sei
      - ottantatre punto sette
      - ottantatre punto otto
      - ottantatre punto nove
      - ottantaquattro
      - ottantaquattro punto uno
      - ottantaquattro punto due
      - ottantaquattro punto tre
      - ottantaquattro punto quattro
      - ottantaquattro punto cinque
      - ottantaquattro punto sei
      - ottantaquattro punto sette
      - ottantaquattro punto otto
      - ottantaquattro punto nove
      - ottantacinque
      - ottantacinque punto uno
      - ottantacinque punto due
      - ottantacinque punto tre
      - ottantacinque punto quattro
      - ottantacinque punto cinque
      - ottantacinque punto sei
      - ottantacinque punto sette
      - ottantacinque punto otto
      - ottantacinque punto nove
      - ottantasei
      - ottantasei punto uno
      - ottantasei punto due
      - ottantasei punto tre
      - ottantasei punto quattro
      - ottantasei punto cinque
      - ottantasei punto sei
      - ottantasei punto sette
      - ottantasei punto otto
      - ottantasei punto nove
      - ottantasette
      - ottantasette punto uno
      - ottantasette punto due
      - ottantasette punto tre
      - ottantasette punto quattro
      - ottantasette punto cinque
      - ottantasette punto sei
      - ottantasette punto sette
      - ottantasette punto otto
      - ottantasette punto nove
      - ottantotto
      - ottantotto punto uno
      - ottantotto punto due
      - ottantotto punto tre
      - ottantotto punto quattro
      - ottantotto punto cinque
      - ottantotto punto sei
      - ottantotto punto sette
      - ottantotto punto otto
      - ottantotto punto nove
      - ottantanove
      - ottantanove punto uno
      - ottantanove punto due
      - ottantanove punto tre
      - ottantanove punto quattro
      - ottantanove punto cinque
      - ottantanove punto sei
      - ottantanove punto sette
      - ottantanove punto otto
      - ottantanove punto nove
      - novanta
      - novanta punto uno
      - novanta punto due
      - novanta punto tre
      - novanta punto quattro
      - novanta punto cinque
      - novanta punto sei
      - novanta punto sette
      - novanta punto otto
      - novanta punto nove
      - novantuno
      - novantuno punto uno
      - novantuno punto due
      - novantuno punto tre
      - novantuno punto quattro
      - novantuno punto cinque
      - novantuno punto sei
      - novantuno punto sette
      - novantuno punto otto
      - novantuno punto nove
      - novantadue
      - novantadue punto uno
      - novantadue punto due
      - novantadue punto tre
      - novantadue punto quattro
      - novantadue punto cinque
      - novantadue punto sei
      - novantadue punto sette
      - novantadue punto otto
      - novantadue punto nove
      - novantatre
      - novantatre punto uno
      - novantatre punto due
      - novantatre punto tre
      - novantatre punto quattro
      - novantatre punto cinque
      - novantatre punto sei
      - novantatre punto sette
      - novantatre punto otto
      - novantatre punto nove
      - novantaquattro
      - novantaquattro punto uno
      - novantaquattro punto due
      - novantaquattro punto tre
      - novantaquattro punto quattro
      - novantaquattro punto cinque
      - novantaquattro punto sei
      - novantaquattro punto sette
      - novantaquattro punto otto
      - novantaquattro punto nove
      - novantacinque
      - novantacinque punto uno
      - novantacinque punto due
      - novantacinque punto tre
      - novantacinque punto quattro
      - novantacinque punto cinque
      - novantacinque punto sei
      - novantacinque punto sette
      - novantacinque punto otto
      - novantacinque punto nove
      - novantasei
      - novantasei punto uno
      - novantasei punto due
      - novantasei punto tre
      - novantasei punto quattro
      - novantasei punto cinque
      - novantasei punto sei
      - novantasei punto sette
      - novantasei punto otto
      - novantasei punto nove
      - novantasette
      - novantasette punto uno
      - novantasette punto due
      - novantasette punto tre
      - novantasette punto quattro
      - novantasette punto cinque
      - novantasette punto sei
      - novantasette punto sette
      - novantasette punto otto
      - novantasette punto nove
      - novantotto
      - novantotto punto uno
      - novantotto punto due
      - novantotto punto tre
      - novantotto punto quattro
      - novantotto punto cinque
      - novantotto punto sei
      - novantotto punto sette
      - novantotto punto otto
      - novantotto punto nove
      - novantanove
      - novantanove punto uno
      - novantanove punto due
      - novantanove punto tre
      - novantanove punto quattro
      - novantanove punto cinque
      - novantanove punto sei
      - novantanove punto sette
      - novantanove punto otto
      - novantanove punto nove
      - cento

$ cat data/intents.yml (excerpt)

nlu:
  - intent: body_temperature_data
    examples: |
      - sto bene
      - niente febbre
      - normale
      - sono senza febbre
      - non ho febbre
      - non mi sento la febbre
      - ho qualche linea
      - ho la febbre
      - credo di avere la febbre
      - mi sento un po di febbre
      - mi sento caldo
      - poca
      - molto poca
      - bassa
      - alta
      - molto alta
      - temperatura: [35](body_temperature) gradi
      - la temperatura è [ovantadue](oxygen_saturation) gradi
      - [35.5](body_temperature)
      - [35.9](body_temperature) gradi
      - [35 e 9](body_temperature)
      - [35,9](body_temperature)
      - [36](body_temperature) esatti
      - [36.0](body_temperature)
      - [36 e 7](body_temperature)
      - [36 , 8](body_temperature)
      - [36,9](body_temperature)
      - [37](body_temperature) gradi esatti
      - [37.9](body_temperature)
      - [37,2](body_temperature)
      - [37.5](body_temperature) gradi
      - [37 , 6](body_temperature)
      - [37 . 6](body_temperature)
      - [38 e 8](body_temperature)
      - [38 , 1](body_temperature)
      - [38 . 2](body_temperature)
      - [39 e 3](body_temperature) linee
      - [trentasei e otto](body_temperature) linee
      - la temperatura è [trentasette](body_temperature)
      - [trentasette emmezzo](body_temperature)
      - [trentasette e mezzo](body_temperature)
      - [trentasette punto otto](body_temperature)
      - [trentasette e quattro lineette](body_temperature)
      - [trentasette e 6](body_temperature) linee
      - [trentasette virgola sei](body_temperature)
      - ho [trentotto](body_temperature) di temperatura
      - ho [trentotto  e 2](body_temperature) linee
      - [trentotto e due](body_temperature)
      - [trentotto virgola tre](body_temperature)
      - [trentotto e quattro](body_temperature)
      - [trentotto virgola quattro](body_temperature)
      - [trentotto emmezzo](body_temperature)
      - [trentanove e due](body_temperature)
      - [trentanove emmezzo](body_temperature)
      - [quaranta](body_temperature)
      - [quarantuno](body_temperature)

  - intent: oxygen_saturation_data
    examples: |
      - il valore è [70](oxygen_saturation)
      - ho [76 e 5](oxygen_saturation)
      - [77 e 9](oxygen_saturation)
      - [78,7](oxygen_saturation)
      - [95](oxygen_saturation)
      - [95.4](oxygen_saturation)
      - [96](oxygen_saturation)
      - [97](oxygen_saturation)
      - [97.5](oxygen_saturation)
      - esattamente [98](oxygen_saturation)
      - [98,4](oxygen_saturation)
      - [99 e mezzo](oxygen_saturation)
      - [99 e 1](oxygen_saturation)
      - [99](oxygen_saturation) preciso
      - [99.9](oxygen_saturation)
      - [100](oxygen_saturation)
      - circa [settantotto](oxygen_saturation)
      - quasi [settantanove](oxygen_saturation)
      - [ottanta](oxygen_saturation)
      - precisamentee [settantasette](oxygen_saturation)
      - [ottantatre emmezzo](oxygen_saturation)
      - [ottantatre virgola sei](oxygen_saturation)
      - [ottantaquattro emmezzo](oxygen_saturation)
      - [ottantaquattro e sei](oxygen_saturation)
      - [ottantaquattro virgola sette](oxygen_saturation)
      - [ottantaquattro virgola sei](oxygen_saturation)
      - [ottantaquattro e nove](oxygen_saturation)
      - [puntopuntiottantacinque](oxygen_saturation)
      - [ottantacinque punto cinque](oxygen_saturation)
      - [ottantacinque e quattro](oxygen_saturation)
      - proprio [ottantasei](oxygen_saturation)
      - il valore è [ottantasette](oxygen_saturation) preciso
      - [ottantotto](oxygen_saturation) tondi
      - giusto [ottantanove](oxygen_saturation)
      - [novanta](oxygen_saturation) giusto
      - [novanta e tre](oxygen_saturation)
      - [novanta virgola otto](oxygen_saturation)
      - [novantuno](oxygen_saturation)
      - [novantuno e nove](oxygen_saturation)
      - [novantadue](oxygen_saturation)
      - [novantadue emmezzo](oxygen_saturation)
      - [novantatre e sei](oxygen_saturation)
      - [novantatre virgola otto](oxygen_saturation)
      - [novantatre e due](oxygen_saturation)
      - [novantatre virgola nove](oxygen_saturation)
      - [novantaquattro](oxygen_saturation)
      - [novantaquattro punto due](oxygen_saturation)
      - [novantaquattro virgola otto](oxygen_saturation)
      - [novantacinque](oxygen_saturation)
      - [novantacinque e cinque](oxygen_saturation)
      - [novantacinque punto cinque](oxygen_saturation)
      - [novantasei](oxygen_saturation)
      - [novantasei e uno](oxygen_saturation)
      - [novantasei e cinque](oxygen_saturation)
      - [novantasette](oxygen_saturation)
      - [novantasette punto due](oxygen_saturation)
      - [novantasette e sei](oxygen_saturation)
      - [novantotto](oxygen_saturation)
      - [novantotto e cinque](oxygen_saturation)
      - [novantanove](oxygen_saturation)
      - [novantanove emmezzo](oxygen_saturation)
      - [cento](oxygen_saturation)

$ rasa shell nlu

2021-08-15 18:01:31 INFO     rasa.model  - Loading model models/20210815-175244.tar.gz...
2021-08-15 18:01:36 INFO     rasa.nlu.components  - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-it_core_news_sm'.
NLU model loaded. Type a message and press enter to parse it.
Next message:
90.3
{
  "text": "90.3",
  "intent": {
    "id": 7258451808749595588,
    "name": "oxygen_saturation_data",
    "confidence": 0.9954347610473633
  },
  "entities": [
    {
      "entity": "oxygen_saturation",
      "start": 0,
      "end": 4,
      "confidence_entity": 0.9767475724220276,
      "value": "90.3",
      "extractor": "DIETClassifier"
    }
  ],

Next message:
novnt virgola tri
{
  "text": "novnt virgola tri",
  "intent": {
    "id": 7258451808749595588,
    "name": "oxygen_saturation_data",
    "confidence": 0.9423763155937195
  },
  "entities": [
    {
      "entity": "oxygen_saturation",
      "start": 6,
      "end": 13,
      "confidence_entity": 0.8582544922828674,
      "value": "virgola",
      "extractor": "DIETClassifier"
    }
  ],

Next message:
ho trentasette gradi giusti
{
  "text": "ho trentasette gradi giusti",
  "intent": {
    "id": -3485198568115343109,
    "name": "body_temperature_data",
    "confidence": 0.9999998807907104
  },
  "entities": [
    {
      "entity": "body_temperature",
      "start": 3,
      "end": 14,
      "confidence_entity": 0.9392516613006592,
      "value": "trentasette",
      "extractor": "DIETClassifier"
    }
  ],

Next message:
ho trentotto emmezzo di febbre
{
  "text": "ho trentotto emmezzo di febbre",
  "intent": {
    "id": -3485198568115343109,
    "name": "body_temperature_data",
    "confidence": 1.0
  },
  "entities": [
    {
      "entity": "body_temperature",
      "start": 3,
      "end": 20,
      "confidence_entity": 0.7569668889045715,
      "value": "trentotto emmezzo",
      "extractor": "DIETClassifier"
    }
  ],
Next message:
38 virgola cinque
{
  "text": "38 virgola cinque",
  "intent": {
    "id": -583108352491052127,
    "name": "body_temperature_data",
    "confidence": 0.9998660683631897
  },
  "entities": [
    {
      "entity": "body_temperature",
      "start": 0,
      "end": 17,
      "confidence_entity": 0.8543928861618042,
      "value": "38 virgola cinque",
      "extractor": "DIETClassifier"
    }
  ],
Next message:
la teperatura è di trentotto gradi esatti
{
  "text": "la teperatura è di trentotto gradi esatti",
  "intent": {
    "id": -583108352491052127,
    "name": "body_temperature_data",
    "confidence": 0.9999998807907104
  },
  "entities": [
    {
      "entity": "body_temperature",
      "start": 19,
      "end": 28,
      "confidence_entity": 0.978223979473114,
      "value": "trentotto",
      "extractor": "DIETClassifier"
    }
  ],
Next message:
temperatura: ternt'otto gradi emmezzo
{
  "text": "temperatura: ternt'otto gradi emmezzo",
  "intent": {
    "id": -583108352491052127,
    "name": "body_temperature_data",
    "confidence": 0.9999991059303284
  },
  "entities": [
    {
      "entity": "oxygen_saturation",
      "start": 30,
      "end": 37,
      "confidence_entity": 0.7737781405448914,
      "value": "emmezzo",
      "extractor": "DIETClassifier"
    }
  ],
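
The last result above pairs the intent body_temperature_data with an oxygen_saturation entity, so downstream code may want to flag that kind of mismatch before trusting the value. Below is a minimal sketch of such a check, assuming the parse result has the shape printed by `rasa shell nlu` in this thread. The `EXPECTED_ENTITY` mapping, the `check_parse` function, and the 0.8 threshold are hypothetical names and values chosen for illustration; they are not part of Rasa.

```python
# Hypothetical mapping from each intent to the entity it is expected to carry,
# based on the two intents described in this thread (not a Rasa API).
EXPECTED_ENTITY = {
    "body_temperature_data": "body_temperature",
    "oxygen_saturation_data": "oxygen_saturation",
}


def check_parse(result, min_confidence=0.8):
    """Return a list of warnings for one NLU parse result (a plain dict)."""
    warnings = []
    intent = result.get("intent", {}).get("name")
    expected = EXPECTED_ENTITY.get(intent)
    if expected is None:
        # Intent is not one of the two measurement intents: nothing to check.
        return warnings

    entities = result.get("entities", [])
    if not entities:
        warnings.append(f"no entity extracted for intent {intent!r}")

    for ent in entities:
        if ent.get("entity") != expected:
            warnings.append(
                f"entity {ent.get('entity')!r} does not match intent {intent!r}"
            )
        # The threshold is an arbitrary illustrative value, not a recommendation.
        if ent.get("confidence_entity", 0.0) < min_confidence:
            warnings.append(
                f"low entity confidence for value {ent.get('value')!r}"
            )
    return warnings


if __name__ == "__main__":
    # Trimmed copy of the last parse result shown in this thread.
    mismatched = {
        "text": "temperatura: ternt'otto gradi emmezzo",
        "intent": {"name": "body_temperature_data", "confidence": 0.9999991059303284},
        "entities": [
            {
                "entity": "oxygen_saturation",
                "start": 30,
                "end": 37,
                "confidence_entity": 0.7737781405448914,
                "value": "emmezzo",
                "extractor": "DIETClassifier",
            }
        ],
    }
    for warning in check_parse(mismatched):
        print(warning)
```

A check like this only compares entity names and confidences. It would not catch a case such as "novnt virgola tri", where the entity type is the expected one but the extracted value is just "virgola".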

See also the blog post: 10 Best Practices for Designing NLU Training Data

nik202 · August 15, 2021, 6:58pm

@solyarisoftware Congratulations, finally! Please close this thread by marking the solution you shared in your last post, and good luck with your project.
