Entities not recognized correctly

Hi

I have numeric data: a tax ID (numeric, length 9), a routing number (numeric, length 9), and an account number (numeric, length 5 to 15). In some cases the routing number is recognized as the account number or tax ID, or the account number is recognized as the routing number or tax ID, and the slots get set with those wrong values.

Can you please let me know how to fix this problem?

I am using forms in my actions.py file. Whenever I enter a number, one of the slots above is set from the entity extracted from the input (even when that entity is wrong) without the form's validation running. Because the slot is then no longer None, the form skips asking for it. How can I make the form validate a slot before it is set from an entity?

Also, how can I accept free text without it being checked against the NLU data? I have a name slot in my form.

The training data, config file, and actions.py file are below.

data nlu:

intent:inform

config.yml:

    # Configuration for Rasa NLU.
    # Components
    language: en
    pipeline: supervised_embeddings

    # Configuration for Rasa Core.
    # Policies
    policies:
      - name: MemoizationPolicy
        max_history: 10
      - name: KerasPolicy
        max_history: 10
      - name: MappingPolicy
      - name: FormPolicy
      - name: FallbackPolicy
        nlu_threshold: 0.5
        core_threshold: 0.5
        fallback_action_name: "action_custom_fallback"

actions.py:

class UserLoginForm2(FormAction):
    """Example of a custom form action"""

    def name(self):
        """Unique identifier of the form"""

        return "user_easypay2_form"

    @staticmethod
    def required_slots(tracker: Tracker) -> List[Text]:
            """A list of required slots that the form has to fill"""
            print("*************required slots form has to fill started 2****************")
            return ["accholdername","routingnumber","accountno"]

    def slot_mappings(self):
        # type: () -> Dict[Text: Union[Dict, List[Dict]]]
        """A dictionary to map required slots to
            - an extracted entity
            - intent: value pairs
            - a whole message
            or a list of them, where a first match will be picked"""
        print("*****slot mappings*******")
        return {"accholdername": [self.from_text()],
                "routingnumber": [
                    self.from_entity(
                        entity="routingnumber", intent=["inform"]
                    ),
                    self.from_entity(entity="number")
                ],
                "accountno": [
                    self.from_entity(
                        entity="accountno", intent=["inform"]
                    ),
                    self.from_entity(entity="number")
                ]}
    
    def validate_slots(self,
                   slot_dict: Dict,
                   dispatcher: CollectingDispatcher,
                   tracker: Tracker,
                   domain: Dict[Text, Any]) -> List[Dict]:
          print("Enters into validation part in which we will validate the slots entered in form2")
          slot_values = self.extract_other_slots(dispatcher, tracker, domain)
          print("slot_values",slot_values)

          slot_to_fill = tracker.get_slot(REQUESTED_SLOT)
          print("slot_to_fill",slot_to_fill)
       
          if slot_to_fill:

            slot_values.update(self.extract_requested_slot(dispatcher, tracker, domain))
            #print("slot values",slot_values)
            for slot, value in slot_values.items():
              print("******** Slot Validation started for form 2 **********")
              ##common_url=tracker.get_slot('common_url')
              print("******** Slot Validation started for form 2 **********")
                
              if slot == 'accholdername':
                    print("value in accholdername", value)
                    # Allow spaces so multi-word names such as "kevin john" pass
                    if value.replace(" ", "").isalpha():
                        print("Account holder name contains only letters")
                    else:
                        dispatcher.utter_message("Bank Account Holder's Name must be alphabetic")
                        print("Account holder name contains digits or other non-letter characters")
                        dispatcher.utter_template('utter_ask_accholdername', tracker)
                        slot_values[slot] = None
        
              elif slot == 'routingnumber':
                    print("value for routing number is", value, type(value))
                    # Coerce to text so the string checks below cannot raise
                    # when the extracted value is not already a string
                    value = str(value)

                    if value.isnumeric():
                        print("Routing number is numeric")
                        if len(value) == 9:
                            print("Routing number length matches")
                            # Checksum: weight each group of three digits by
                            # 3, 7, 1; the total must be a non-zero multiple of 10
                            routing_num_count = 0
                            number = 0
                            while routing_num_count < len(value):
                                number += (int(value[routing_num_count]) * 3
                                           + int(value[routing_num_count + 1]) * 7
                                           + int(value[routing_num_count + 2]))
                                routing_num_count += 3
                            print("The value of number is : ", number, type(number))
                            if number != 0 and number % 10 == 0:
                                print("Routing Number check validated successfully")
                            else:
                                dispatcher.utter_message("Invalid Routing number. Please check and enter valid number.")
                                print("Routing Number check validation fail")
                                dispatcher.utter_template('utter_ask_routingnumber', tracker)
                                slot_values[slot] = None
                        else:
                            dispatcher.utter_message("Routing Number must be 9 digits.")
                            print("Routing number length is not 9")
                            dispatcher.utter_template('utter_ask_routingnumber', tracker)
                            slot_values[slot] = None
                    else:
                        dispatcher.utter_message("Routing Number must be numeric.")
                        print("Routing number is non numeric")
                        dispatcher.utter_template('utter_ask_routingnumber', tracker)
                        slot_values[slot] = None
                     

              elif slot == 'accountno':
                    print("value for account number is", value, type(value))
                    # Coerce to text so the string checks below cannot raise
                    value = str(value)
                    if value.isnumeric():
                        print("Account number is numeric")
                        # Length must be within 5 to 17 digits inclusive
                        if 5 <= len(value) <= 17:
                            print("Account number length matches")
                        else:
                            dispatcher.utter_message("Account Number must be at least 5 digits and max 17 digits")
                            print("Account number length not valid")
                            dispatcher.utter_template('utter_ask_accountno', tracker)
                            slot_values[slot] = None
                    else:
                        dispatcher.utter_message("Account Number must be numeric.")
                        print("Account number is not numeric")
                        dispatcher.utter_template('utter_ask_accountno', tracker)
                        slot_values[slot] = None


                    
                        
          return [SlotSet(slot, value) for slot, value in slot_values.items()]       
    
    def submit(self,dispatcher: CollectingDispatcher,tracker: Tracker,domain: Dict[Text, Any],) -> List[Dict]:
            """Define what the form has to do after all required slots are filled"""
            print("******Submitted button activated**********")
            # utter submit template
            accholdername= tracker.get_slot('accholdername')
            routingnumber= tracker.get_slot('routingnumber')
            accountno= tracker.get_slot('accountno')
            print("accholdername",accholdername)
            print("routingnumber",routingnumber)
            print("accountno",accountno)
            return[]

Thanks in advance. I am hoping for an early reply.

Can anyone please reply?

It doesn’t look like you’ve tagged any of your entities in NLU:

e.g., you have:
account no is 7819019 and routing num 333743123

and you should have something like:
account no is [7819019](accountno) and routing num [333743123](routingnumber)

Also, set up regex patterns for each of the entities to help differentiate them.

BTW, if you have two or more entities that have the same format (e.g., 9 digit number) and you expect that the user may enter it by itself (like you have in your examples), you might want to define it as a more general entity like ‘number’ and then add some code in your action server to figure out what the actual entity is based on what the bot just said. Then you can set the slot as appropriate. e.g.,

[123456789](number)

OTOH, if it's a sample of just one legal format, then you can label it correctly:

[12345](accountno)

Again, make sure to set up the regex patterns to help it out.


Thanks for your response

  1. I did tag the entities in the NLU data I posted above, but the forum renders them as highlighted strings instead of showing the entity and value. My NLU training data looks like this, for example: My account number is 123654 (account_no) and my routing number is 123123123 (routing_number)

But my problem is that sometimes, when I give other values in the above format, account_no takes the routing value and routing_number takes the account number value, since both are numeric. Because of this the slots get set with wrong values.

  2. I cannot use regex patterns for these because I have to validate the values the user entered, and if the user entered a wrong value I should tell them what is wrong with it. If I use regex, the conversation ends right there when the user enters an alphanumeric value for a numeric field.

BTW, if you have two or more entities that have the same format (e.g., 9 digit number) and you expect that the user may enter it by itself (like you have in your examples), you might want to define it as a more general entity like ‘number’ and then add some code in your action server to figure out what the actual entity is based on what the bot just said. Then you can set the slot as appropriate.

I did not understand the part quoted above. Can you please explain it with a small example?

Ah, format your text as “Preformatted text” (the </> icon above) and it will show in the original format, just like you did for your Python code. Then repost your original examples so I can see how it’s being marked up.

Then provide a specific example of what it’s confusing.

NLU DATA:

  • [123456789](accountno)
  • [12345](accountno)
  • [1234512345](accountno)
  • [98765432](accountno)
  • [7865432](accountno)
  • [6576987](accountno)
  • [543298098](accountno)
  • [123123123](routingnumber)
  • [322284892](routingnumber)
  • [102103119](routingnumber)
  • [101000187](routingnumber)
  • [101200453](routingnumber)
  • [102000021](routingnumber)
  • [102101645](routingnumber)
  • the account number is [12345](accountno)
  • account no is [12349](accountno)
  • acc number is [9991119](accountno)
  • acc no is [78908794](accountno)
  • the account number is [7284578](accountno)
  • account no is [9879049](accountno)
  • acc number is [9871119](accountno)
  • acc no is [76701799](accountno)
  • the account number is [52345](accountno)
  • account no is [62741](accountno)
  • acc number is [67891119](accountno)
  • acc no is [12389087](accountno)
  • the account number is [42345](accountno)
  • account no is [82249](accountno)
  • acc number is [5691122](accountno)
  • acc no is [5608751](accountno)
  • the routing number is [123123123](routingnumber)
  • routing no [104000029](routingnumber)
  • routing num [107001452](routingnumber)
  • the routing number is [107002312](routingnumber)
  • routing no [121122676](routingnumber)
  • routing num [121201694](routingnumber)
  • the routing number is [122235821](routingnumber)
  • routing no [122105155](routingnumber)
  • routing num [122238682](routingnumber)
  • the routing number is [122401781](routingnumber)
  • routing no [123000220](routingnumber)
  • routing num [123000848](routingnumber)
  • account no and routing num are [9879049](accountno),[123103729](routingnumber)
  • account no is [7819019](accountno) and routing num [124302150](routingnumber)
  • account number is [1811019](accountno) and routing number [125000105](routingnumber)
  • acc number is [7877019](accountno) and routing number is [273970514](routingnumber)
  • account no and routing num are [123904](accountno),[307070115](routingnumber)
  • account no is [601945](accountno) and routing num [322270356](routingnumber)
  • account number is [1711989](accountno) and routing number [322270495](routingnumber)
  • acc number is [581020](accountno) and routing number is [322284892](routingnumber)
  • account no and and routing num are [1119060](accountno),[122038442](routingnumber)
  • account no is [580030](accountno) and routing num [122187160](routingnumber)
  • account number is [3331020](accountno) and routing number [122212611](routingnumber)
  • acc number is [451115](accountno) and routing number is [122226937](routingnumber)
  • account no and and routing num are [49150](accountno),[122228812](routingnumber)
  • account no is [113456](accountno) and routing num [122234822](routingnumber)
  • account number is [781056](accountno) and routing number [122240502](routingnumber)
  • acc number is [187103](accountno) and routing number is [124103760](routingnumber)
  • [nayana](accholdername)
  • [peter](accholdername)
  • [jack](accholdername)
  • [christan](accholdername)
  • [rob](accholdername)
  • [steve](accholdername)
  • [white](accholdername)
  • [robin](accholdername)
  • [smith](accholdername)
  • [gupta](accholdername)
  • [Adam](accholdername)
  • [Andrewe](accholdername)
  • [Anthonye](accholdername)
  • [Archbould](accholdername)
  • [Arthure](accholdername)
  • [Ambrose](accholdername)
  • [Alexander](accholdername)
  • [Barthe](accholdername)
  • [Bartram](accholdername)
  • [Bryan](accholdername)
  • [Andro](accholdername)
  • [Anthony](accholdername)
  • [Anthonie](accholdername)
  • the account holder name is [john](accholdername)
  • the name of the user is [peterson](accholdername)
  • my name is [kevin john](accholdername)
  • the account holder name [alisa peter](accholdername)
  • the account holder name is [surya](accholdername)
  • the account holder name is [gilchrist](accholdername)
  • the account holder name is [warner](accholdername)
  • the account holder name is [mikka](accholdername)
  • the account holder name is [todd](accholdername)
  • the account holder name is [mark](accholdername)
  • the account holder name is [randy](accholdername)
  • the account holder name is [katrina](accholdername)
  • [bharat](accholdername)
  • [kajal](accholdername)
  • [iswar](accholdername)
  • [paul chandrapal](accholdername)
  • [jondy rodes](accholdername)
  • my name is [katrina](accholdername)
  • account holder name [divya](accholdername)
  • the account holder name is [karthik](accholdername)
  • account holder name [pavan](accholdername)
  • my name is [robin](accholdername)
  • the account holder name is [bobby](accholdername)
  • account holder name [jack](accholdername)
  • account holder name [robinhood](accholdername)
  • my name is [steve job](accholdername)
  • the account holder name is [steve solmon](accholdername)
  • my name is [ranabir](accholdername)
  • the account holder name is [ranavir](accholdername)
  • account holder name [john abraham](accholdername)
  • account holder name [sachin](accholdername)
  • the account holder name [alisa peter](accholdername)
  • the account holder name is [john](accholdername)
  • the account holder name is [snow](accholdername)
  • the account holder name is [lanister](accholdername)
  • the account holder name is [stark](accholdername)
  • the account holder name is [bill gates](accholdername)
  • the account holder name is [steve jobs](accholdername)
  • the account holder name is [sangakara](accholdername)
  • the account holder name is [malinga](accholdername)

This is my NLU data. My doubt is that, for examples that are not in the NLU data, the account number is sometimes recognized as the routing number and the routing number as the account number. The form then sets the slots with the wrong entities, and because of this the step that asks for the slot is skipped. Can you please help me extract the correct entities when the entities are numbers?

Doubt:

some code in your action server to figure out what the actual entity is based on what the bot just said. Then you can set the slot as appropriate

I did not understand how I can figure out the actual entity from a custom action. Can you please explain with a small example? If possible, can you please share some reference code?

Also, can you please let me know how I can get free text in Rasa forms? I am using self.from_text() in my actions, but the input is still classified against the NLU intents with some probability. I need an action that takes the value from the user without depending on intents.

Note: The account number and routing number must be numeric. The account number length should be between 5 and 17, and the routing number length is 9.
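The rules in this note, together with the checksum from the posted validation code, can be sketched as two standalone functions. This is only an illustrative sketch: the function names are hypothetical, the 5 to 17 length bounds are taken from the note above, and the 3-7-1 weighting mirrors the loop in the posted actions.py.

```python
import re


def is_valid_routing_number(value: str) -> bool:
    """Nine ASCII digits whose 3-7-1 weighted sum is a non-zero multiple of 10."""
    if not re.fullmatch(r"[0-9]{9}", value):
        return False
    weights = (3, 7, 1) * 3  # same weights as the loop in the posted form code
    total = sum(int(digit) * weight for digit, weight in zip(value, weights))
    return total != 0 and total % 10 == 0


def is_valid_account_number(value: str, min_len: int = 5, max_len: int = 17) -> bool:
    """ASCII digits only, with a length inside the inclusive bounds."""
    if not re.fullmatch(r"[0-9]+", value):
        return False
    return min_len <= len(value) <= max_len


print(is_valid_routing_number("123123123"))  # sample value from the NLU data
print(is_valid_account_number("12345"))
```

Keeping the checks separate from the dispatcher calls makes them easy to test without running a bot.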

I asked if you could include some examples where it get the entities wrong. I don’t see any of those.

I’m assuming that if a user says “routing number is 121122676” then it will correctly identify the number as the routingnumber. It looks like you have enough training data for that. I’m also assuming that the problem occurs when the user just enters a number, e.g., “121122676”.

In this case, it’s not really possible for anyone to know whether it’s a routingnumber or accountno, right? What I was suggesting was that you define an accountno regex as \d{5,17} and routingnumber to \d{9}. This will at least reduce the confusion for numbers that aren’t 9 digits. However, there can still be ambiguity if it’s exactly 9 digits.
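To make the overlap concrete, here is a small plain-Python sketch (not Rasa's regex feature syntax) that checks a standalone input against the two patterns mentioned above. The function name candidate_entities is hypothetical.

```python
import re

# The two patterns suggested above, applied to the whole input string.
ACCOUNT_RE = re.compile(r"\d{5,17}")
ROUTING_RE = re.compile(r"\d{9}")


def candidate_entities(text: str) -> list:
    """Return every entity name whose pattern matches the full input."""
    candidates = []
    if ACCOUNT_RE.fullmatch(text):
        candidates.append("accountno")
    if ROUTING_RE.fullmatch(text):
        candidates.append("routingnumber")
    return candidates


print(candidate_entities("83273"))      # only the account pattern fits
print(candidate_entities("121122676"))  # nine digits fit both patterns
```

A nine-digit input matches both patterns, so the format alone cannot decide between the two entities.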

But you did also mention that you thought that users may enter the wrong number of digits and want to catch those errors. In that case, you really can’t use regex.

Assuming the problem is just standalone numbers, I’d just tag everything as accountno.

"[12345689](accountno)"
"[83273](accountno)"

Then in your validate_accountno code, figure out whether the bot was asking for the account number or the routing number. (see next response)

OK, I just checked.

in your validate_accountno code:

tracker.latest_message.text contains what the user just entered.

To get the current bot context (the last thing your bot asked), you can look at tracker.events:

  • tracker.events[-1] is what the user just entered
  • tracker.events[-2] is the action_listen event
  • tracker.events[-3] is what you want to look at

It will look something like this:

{'event': 'slot', 'timestamp': 1569887182.296513, 'name': 'requested_slot', 'value': 'routingnumber'}

Here you can see that event type is ‘slot’ (asking for a form’s slot value) and the specific slot is ‘routingnumber’.

That means you should set the routingnumber slot to this value and not set the value of account_no.

There may be a more straightforward way of doing this, but that’s what I’ve done in my code for a similar problem.
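A minimal sketch of this idea follows. It assumes the events are plain dicts shaped like the one shown above, and the helper names are hypothetical. It scans backwards for the most recent requested_slot event instead of relying on a fixed index, which is a choice made for this sketch. The form code posted earlier in the thread reads the same value with tracker.get_slot(REQUESTED_SLOT).

```python
from typing import Any, Dict, List, Optional


def last_requested_slot(events: List[Dict[str, Any]]) -> Optional[str]:
    """Return the value of the most recent 'requested_slot' slot event, if any."""
    for event in reversed(events):
        if event.get("event") == "slot" and event.get("name") == "requested_slot":
            return event.get("value")
    return None


def assign_number(events: List[Dict[str, Any]], number: str) -> Dict[str, str]:
    """Map a bare number to whichever numeric slot the bot last asked for."""
    slot = last_requested_slot(events)
    if slot in ("routingnumber", "accountno"):
        return {slot: number}
    return {}  # the bot was not asking for a number, so set nothing


# Hand-written sample events; only the first dict is copied from the thread.
sample_events = [
    {"event": "slot", "timestamp": 1569887182.296513,
     "name": "requested_slot", "value": "routingnumber"},
    {"event": "action", "name": "action_listen"},
    {"event": "user", "text": "121122676"},
]
print(assign_number(sample_events, "121122676"))
```

In a real action the list would come from tracker.events, and the returned mapping would be turned into slot-setting events.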

Thanks for your response

I’m assuming that if a user says “routing number is 121122676” then it will correctly identify the number as the routingnumber. It looks like you have enough training data for that. I’m also assuming that the problem occurs when the user just enters a number, e.g., “121122676”

It happens not only when the user enters just a value, but also when the user says something like "my routing number is 123123123 and account number is 123456789". In that case too, the routing number slot is sometimes set with 123456789 and the account number slot with 123123123 (this is just an example).

This is just a guess as I’m not an expert in this stuff…

I think the problem is in these types of training examples:

account no and routing num [9879049](accountno),[123103729](routingnumber)

The component CRFEntityExtractor is responsible for extracting the entities, and presumably it has some window of tokens on either side to help it in extraction. Let’s say that’s +/- 2 tokens.

In the above example, it would see ‘routing’, ‘num’ before the 9879049. It would not see ‘account’, ‘no’, ‘and’.

So it doesn’t build the association you want, which is why it gets confused at runtime.

(Again, apologies if I’m wrong.)
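The guess above can be illustrated with a toy context window. This is not how CRFEntityExtractor is implemented; the tokenizer, the function names, and the window size of 2 are all assumptions taken from the hypothetical "+/- 2 tokens" in this reply.

```python
import re
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Very rough tokenizer: runs of letters and digits, punctuation dropped."""
    return re.findall(r"\w+", text)


def context_window(tokens: List[str], index: int, size: int = 2) -> Tuple[List[str], List[str]]:
    """Return up to `size` tokens before and after the token at `index`."""
    before = tokens[max(0, index - size):index]
    after = tokens[index + 1:index + 1 + size]
    return before, after


tokens = tokenize("account no and routing num 9879049,123103729")
before, after = context_window(tokens, tokens.index("9879049"))
print(before, after)
```

With a window of two tokens, the words nearest the account number are "routing" and "num", and the words "account" and "no" fall outside it.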

To test this, remove your training examples that have this format (fwiw, this seems like an unnatural way for users to state those two numbers. seems like ‘my account number is 123 and routing number is 345’ is more natural than ‘my account number and routing number are 123, 345’).

Retrain and try again. If it works, then somehow you have to get the CRFEntityExtractor to use a larger window. Maybe try MitieEntityExtractor instead? There are hackier things like rewriting the sentence, but you probably don’t want to get into that.

And, of course, you could just not support this particular weird sentence structure and move on with your project. :wink:

Thanks

Can you please let me know how MitieEntityExtractor is different from CRFEntityExtractor?

I don’t know what the functional difference is (I’ve only been using Rasa for a month or so). I think the most important thing is to determine if the intuition is correct. If it is, then we can ask a new question about how to incorporate a larger window in the entity extraction process. If not, then we need to continue to dig into the problem.

Thanks. I have retrained the model after deleting the type of examples you mentioned. Now when I give the input "my group number is A07471 and TAX ID is 274450731", it extracts only the group number entity (A07471) and not the tax ID (274450731).

Can you please tell me why it extracts only the group number and leaves out the tax ID?

NLU DATA:

  • my group number [A07471](group_num_form) and tax-id is [274950677](tax_id)

  • group number is [G09401](group_num_form) and tax id is [278789077](tax_id)

  • the grp number for my plan is [O09911](group_num_form) and tax id is [678789811](tax_id)

  • tax id is [274950677](tax_id)

  • the tax id for my group number is [571952676](tax_id)

  • my tax id is [411751881](tax_id)

  • tax id [630170819](tax_id)

  • the tax id for my group number is [331470899](tax_id)

  • tax id is [611974874](tax_id)

  • my tax id is [510370611](tax_id)

  • my tax id is [831450835](tax_id)

  • tax id [510680191](tax_id)

  • tax id is [123479835](tax_id)

  • my tax id [504410110](tax_id)

  • tax is for my account [157508110](tax_id)

  • tax id for my plan is [248501011](tax_id)

  • tax id for my plan is [339510530](tax_id)

  • tax id for my account [420150362](tax_id)

  • my tax id is [531750813](tax_id)

  • my group number [H01471](group_num_form) and tax-id is [171980617](tax_id)

  • group number is [L09401](group_num_form) and tax id is [970719007](tax_id)

  • the grp number for my plan is [O01991](group_num_form) and tax id is [178189111](tax_id)

  • my group number [Z01411](group_num_form) and tax-id is [188950311](tax_id)

  • group number is [K01101](group_num_form) and tax id is [138119071](tax_id)

  • the grp number for my plan is [B01771](group_num_form) and tax id is [118181811](tax_id)

  • my group number [I12111](group_num_form) and tax-id is [114951681](tax_id)

  • group number is [Y09111](group_num_form) and tax id is [218587169](tax_id)

  • the grp number for my plan is [R11811](group_num_form) and tax id is [171581855](tax_id)

  • my group number [A07475](group_num_form) and tax-id is [374950677](tax_id)

  • group number is [G09405](group_num_form) and tax id is [378789077](tax_id)

  • the grp number for my plan is [O09955](group_num_form) and tax id is [678789855](tax_id)

  • tax id is [771980677](tax_id)

  • the tax id for my group number is [878987676](tax_id)

  • my tax id is [188781881](tax_id)

  • tax id [670870889](tax_id)

  • the tax id for my group number is [778170899](tax_id)

  • tax id is [688971871](tax_id)

  • my tax id is [180770688](tax_id)

  • my tax id is [878180878](tax_id)

  • tax id [880680898](tax_id)

  • tax id is [877179878](tax_id)

  • my tax id [801110880](tax_id)

  • tax is for my account [187808880](tax_id)

  • tax id for my plan is [718808088](tax_id)

  • tax id for my plan is [779880870](tax_id)

  • tax id for my account [170880767](tax_id)

  • my tax id is [878780187](tax_id)

  • my group number [H08178](group_num_form) and tax-id is [878980687](tax_id)

  • group number is [L09108](group_num_form) and tax id is [970789007](tax_id)

  • the grp number for my plan is [O08998](group_num_form) and tax id is [878889888](tax_id)

  • my group number [Z08188](group_num_form) and tax-id is [888980788](tax_id)

  • group number is [K08808](group_num_form) and tax id is [878889078](tax_id)

  • the grp number for my plan is [B08778](group_num_form) and tax id is [888888888](tax_id)

  • my group number [I87888](group_num_form) and tax-id is [881988688](tax_id)

  • group number is [Y09888](group_num_form) and tax id is [788887869](tax_id)

  • the grp number for my plan is [R88888](group_num_form) and tax id is [878888888](tax_id)

@Rasa @erohmensing @souvikg10 Can anyone from the Rasa team help me find a solution for the above problem?

Look at your debug output, you did not enter “my group number is A07471 and TAX ID is 274450731” … instead you entered “my group number is A07471 and group number is 274450731” (you specified group number twice).

Also, keep in mind that it’s case-sensitive, so make sure to mix case in examples

Sorry, I attached the wrong screenshot. See the screenshot below, in which I give both the group number and the tax ID. It considers only the group number, leaves out the tax ID, and asks me again to enter the tax ID.

Also, can you please tell me where I can find the full list of tracker attributes available in custom actions, such as tracker.latest_message.text?

Can you please let me know how to choose values for the Memoization and Keras policies? The model behaves differently for different values (some stories do not work for some values). How can I decide the optimum value so that all the stories in the model work?

In your screenshot, you can see that NLU is extracting both the group_num_form and tax_id entities… look at the last line before the slot values.

So the problem is just assigning those entity values to the slots. In your code:

@staticmethod
def required_slots(tracker: Tracker) -> List[Text]:
        """A list of required slots that the form has to fill"""
        print("*************required slots form has to fill started 2****************")
        return ["accholdername","routingnumber","accountno"]

You are not including tax_id as a form slot.

As far as where to get access to tracker… you can get tracker from your validate_slots() function. You are already accessing tracker in your code.