Using the Tracker Store changes the policy

Hello! I am currently working on a project writing a chatbot in Rasa. We want to save the conversation data in a Mongo database, but first we want to anonymize it. I therefore built a custom tracker store, and everything works fine, except that the policy used for predictions changes from the MemoizationPolicy to the TEDPolicy when I use my tracker store. I have no clue why this happens. Is there maybe something I missed? Thank you for your answers.

Hi @user429, it sounds like this might be something in your tracker store implementation - if it’s changing the tracker of your conversations at all, it could change the next prediction. A first debugging step might be to take a specific convo where this is happening, check the state of the tracker before it hits the tracker store anonymization logic, and then after it is retrieved and see what the difference is.
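
A rough sketch of that comparison, assuming both tracker states are available as plain dicts (for example from `tracker.current_state(EventVerbosity.ALL)` before saving and again after retrieving). The helper name `diff_states` and the `MISSING` marker are made up for this illustration and are not part of Rasa:

```python
from typing import Any, List, Tuple

# Hypothetical marker for keys or list items present on only one side.
MISSING = "<missing>"


def diff_states(before: Any, after: Any, path: str = "") -> List[Tuple[str, Any, Any]]:
    """Return (path, before, after) for every leaf value that differs."""
    if isinstance(before, dict) and isinstance(after, dict):
        diffs = []
        for key in sorted(set(before) | set(after), key=str):
            child = f"{path}.{key}" if path else str(key)
            diffs += diff_states(before.get(key, MISSING), after.get(key, MISSING), child)
        return diffs
    if isinstance(before, list) and isinstance(after, list):
        diffs = []
        for i in range(max(len(before), len(after))):
            b = before[i] if i < len(before) else MISSING
            a = after[i] if i < len(after) else MISSING
            diffs += diff_states(b, a, f"{path}[{i}]")
        return diffs
    return [] if before == after else [(path, before, after)]


# Invented example data, only to show the shape of the result.
before = {
    "slots": {"name": "Alice"},
    "events": [
        {"event": "user", "text": "Hi"},
        {"event": "slot", "name": "name", "value": "Alice"},
    ],
}
after = {
    "slots": {"name": None},
    "events": [
        {"event": "user", "text": "Hi"},
        {"event": "slot", "name": "name"},
    ],
}

for path, old, new in diff_states(before, after):
    print(f"{path}: {old!r} -> {new!r}")
```

Any path that shows up here for a slot or an event is a candidate for why the next prediction differs.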

Thanks, @mloubser. I created a simple conversation that only asks for the name: when the user types “Hi”, they are asked to enter their name. When I compare the trackers, I can’t see any obvious difference. But here are the retrieved and saved trackers; maybe you can tell what’s wrong. This is my source code. Did I miss something or do something wrong?

from abc import ABC
from rasa.core.tracker_store import TrackerStore
from rasa.core.trackers import DialogueStateTracker, EventVerbosity
from pymongo import MongoClient


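# Coarsen the user's age: move it up to the next multiple of `distance`
# so that the exact age is not stored (e.g. 25 with distance 3 becomes 27).
def change_age(age: float, distance: int):
    for i in range(1, distance + 1):
        if (age + i) % distance == 0:
            return age + i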


def anonymize(data: dict, delete_entities: list, delete_slots: list):
    # Blank out the current value of every slot that must not be stored.
    for slot_name in data['slots']:
        if slot_name in delete_slots:
            data['slots'][slot_name] = None

    # Redact entity values from the latest message.
    latest = data['latest_message']
    for entity in latest.get('entities', []):
        value = entity.get('value')
        if value is not None:
            if entity.get('entity') == 'age':
                # Replace the exact age in the text with the coarsened one.
                latest['text'] = latest['text'].replace(
                    str(value), str(change_age(float(value), 3)))
            else:
                latest['text'] = latest['text'].replace(str(value), '')
        entity['value'] = None

    # Redact the event history. Each event stays a dict so that it can
    # still be deserialized when the tracker is retrieved.
    for event in data['events']:
        event_type = event.get('event')
        if event_type == 'user':
            parse_data = event['parse_data']
            for entity in parse_data.get('entities', []):
                value = entity.get('value')
                if value is not None:
                    if entity.get('entity') in delete_entities:
                        event['text'] = event['text'].replace(str(value), '')
                    elif entity.get('entity') == 'age':
                        event['text'] = event['text'].replace(
                            str(value), str(change_age(float(value), 3)))
                entity['value'] = None
            # Drop the unredacted copy of the message text.
            parse_data.pop('text', None)
        elif event_type == 'slot':
            if event.get('name') in delete_slots:
                event.pop('value', None)
        elif event_type == 'bot':
            event.pop('text', None)

    return data


class Tracker(TrackerStore, ABC):
    def __init__(self, domain,
                 url,
                 database: str,
                 collection: str,
                 db_url: str,
                 db_port: int,
                 event_broker=None):
        self.domain = domain
        self.url = url
        self.event_broker = event_broker

        self.conn = MongoClient(db_url, db_port)
        db = self.conn[database]
        self.collection = db[collection]

        TrackerStore.__init__(self, domain, event_broker)

    def save(self, tracker):
        if self.event_broker:
            self.stream_events(tracker)

        state = tracker.current_state(EventVerbosity.ALL)
        data = anonymize(state, ['name'], ['name'])

        self.collection.update_one({"sender_id": tracker.sender_id}, {'$set': data}, upsert=True)

    def retrieve(self, sender_id):
        stored = self.collection.find_one({"sender_id": sender_id})

        # Fall back to documents that were saved with an integer sender_id
        # and convert the id to a string on the way.
        if stored is None and sender_id.isdigit():
            from pymongo import ReturnDocument

            stored = self.collection.find_one_and_update({'sender_id': int(sender_id)},
                                                         {'$set': {'sender_id': str(sender_id)}},
                                                         return_document=ReturnDocument.AFTER)

        if stored is None or not self.domain:
            return None

        # Rebuild the tracker from the stored (anonymized) events.
        return DialogueStateTracker.from_dict(sender_id, stored.get('events'), self.domain.slots, 150)

    def __del__(self):
        self.conn.close()

Thanks for the source code - I don’t see the trackers anywhere though? If you have slots that are set in your stories, and you’re unsetting them using your anonymization function, that could cause the policy difference, but it would depend on the specific stories. What do your stories (just the ones relevant to what you’re trying) look like? And what slots are you setting/deleting?
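
To illustrate the slot point, here is a simplified sketch in plain Python. It is not Rasa's actual implementation: the function names `replay_slots` and `slot_is_set` are invented, and it assumes that a tracker rebuilt from stored events takes its slot values from the `slot` events, and that a text slot matters to the policies only through whether it is set.

```python
from typing import Any, Dict, List


def replay_slots(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Derive slot values by replaying serialized 'slot' events in order."""
    slots: Dict[str, Any] = {}
    for event in events:
        if event.get("event") == "slot":
            slots[event["name"]] = event.get("value")
    return slots


def slot_is_set(slots: Dict[str, Any]) -> Dict[str, bool]:
    """Reduce each slot to a set/unset flag (assumed text-slot behaviour)."""
    return {name: value is not None for name, value in slots.items()}


# Invented events as they might look before anonymization...
original_events = [
    {"event": "user", "text": "Hi"},
    {"event": "bot", "text": "Hey! Can you tell me your name?"},
    {"event": "user", "text": "Alice"},
    {"event": "slot", "name": "name", "value": "Alice"},
]
# ...and after the slot value was removed for storage.
stored_events = [
    {"event": "user", "text": "Hi"},
    {"event": "bot"},
    {"event": "user", "text": ""},
    {"event": "slot", "name": "name"},
]

print(slot_is_set(replay_slots(original_events)))
print(slot_is_set(replay_slots(stored_events)))
```

If the two results differ, the conversation state that the policies see after retrieval is no longer the one they saw before saving.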

Hello @mloubser. Thanks for your answer. I’ve created a little example program to test this tracker store. The stories.md looks like this:

## happy path
* greet
  - utter_greet
* ask_name
  - utter_greetname

And the domain.yml looks like this:

intents:
    - greet
    - ask_name
entities:
    - name
    - age

slots:
    name:
        type: text

responses:
    utter_greet:
        - text: "Hey! Can you tell me your name?"

    utter_greetname:
        - text: "Hello {name}"

session_config:
    session_expiration_time: 60
    carry_over_slots_to_new_session: true

What do you mean by saying you can’t see the tracker? Did I implement the Tracker Store incorrectly?

[EDITED] I commented the anonymization of the slots out and tried it again and got the same error.

Sorry, I should have clarified: I meant could you send the literal tracker (how it comes back in the logs), both with and without the custom tracker store?