Suggestion for pipeline

Hi All,

Please suggest a rasa-nlu pipeline for a text string like this:

Ex: "q": "find product having A_1 id 34567 and B_2 id 5467 and C_1 id 456768"

It should automatically return entities like these:

"entities": [
  { "start": 30, "end": 37, "value": "3567678", "entity": "A_1_val", "confidence": 0.9752557482894534, "extractor": "ner_crf" },
  { "start": 51, "end": 58, "value": "5667797", "entity": "B_2_VAL", "confidence": 0.40931024015066964, "extractor": "ner_crf" },
  { "start": 73, "end": 80, "value": "3233232", "entity": "C_3_VAL", "confidence": 0.9524791151899982, "extractor": "ner_crf" }
]

Sometimes I get the correct values, but sometimes I get only A_1_VAL. I am using this pipeline:

  • name: "tokenizer_whitespace"
  • name: "intent_entity_featurizer_regex"
  • name: "ner_crf"
  • name: "ner_synonyms"
  • name: "intent_featurizer_count_vectors"
  • name: "intent_classifier_tensorflow_embedding"
    intent_tokenization_flag: true
    intent_split_symbol: "+"

Well, your pipeline is correct. However, all the entities here are of the same type (numbers), so it is difficult for ner_crf to distinguish between them. You will need many more training examples to extract all of them with good enough accuracy.
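
If writing that many examples by hand is tedious, they can be generated. Here is a rough sketch that prints annotated sentences in the Rasa NLU Markdown training format (`[value](entity)`). The intent name `find_product`, the entity labels, and the number range are assumptions made for illustration, so adjust them to match your own data.

```python
import random

# Field names follow the example query; the entity labels and the intent
# name "find_product" are assumptions, not taken from the original data.
FIELDS = [("A_1", "A_1_val"), ("B_2", "B_2_val"), ("C_1", "C_1_val")]


def make_example(rng):
    """Build one annotated sentence using the [value](entity) syntax."""
    fields = FIELDS[:]
    rng.shuffle(fields)  # vary the order in which the IDs appear
    chosen = fields[: rng.randint(1, len(fields))]  # vary how many IDs appear
    parts = [
        f"{name} id [{rng.randint(100, 9999999)}]({entity})"
        for name, entity in chosen
    ]
    return "find product having " + " and ".join(parts)


def make_training_data(n, seed=0):
    """Return n generated examples under a single intent heading."""
    rng = random.Random(seed)
    lines = ["## intent:find_product"]
    lines += ["- " + make_example(rng) for _ in range(n)]
    return "\n".join(lines)


print(make_training_data(5))
```

Varying the order, the count, and the length of the IDs gives the CRF more context to learn from than repeating one fixed sentence shape.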

You could also use Duckling to retrieve the numbers and manually associate an ID with each one. Duckling uses regex, so you are sure of fetching a number at least, and based on token position in the sentence you can perhaps work out which ID each number belongs to.
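
To illustrate the positional idea, here is a minimal sketch in plain Python. It does not call Duckling; a simple regular expression stands in for the number extraction, and each number is labelled by the field token that precedes "id". The pattern and the `<field>_val` naming are assumptions for illustration only.

```python
import re

# Matches "<field> id <digits>", e.g. "A_1 id 34567".
# This pattern is an assumption about how the queries are phrased.
PAIR = re.compile(r"(\w+)\s+id\s+(\d+)")


def extract_ids(text):
    """Return one entity dict per number, named after the field before 'id'."""
    entities = []
    for m in PAIR.finditer(text):
        entities.append({
            "start": m.start(2),  # character offsets of the number itself
            "end": m.end(2),
            "value": m.group(2),
            "entity": m.group(1) + "_val",  # hypothetical naming scheme
        })
    return entities


text = "find product having A_1 id 34567 and B_2 id 5467 and C_1 id 456768"
for entity in extract_ids(text):
    print(entity)
```

With Duckling in the pipeline, the same association could presumably be done in post-processing by looking at the tokens just before each number's start offset.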