The story at prediction time probably contains more history than this snippet, so it doesn't exactly match the memorized version. The memoization policy is not meant to be the main policy. It exists to ensure that if a conversation goes exactly as in the training data, the memorized action is followed. But training data can contain mistakes, so it is OK to rely on TED predictions.
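To make the idea concrete, here is a toy sketch in Python. It is not the library's actual implementation, only an illustration under simplifying assumptions: the names `build_memo`, `predict_next`, and `fallback_policy` are hypothetical, a history is just a tuple of event names, and the fallback is a stub standing in for a learned policy such as TED.

```python
from typing import Callable, Dict, List, Tuple

History = Tuple[str, ...]


def build_memo(stories: List[List[str]]) -> Dict[History, str]:
    """Memorize, for every story prefix, the bot action that follows it."""
    memo: Dict[History, str] = {}
    for story in stories:
        for i in range(1, len(story)):
            next_event = story[i]
            # Assumption: bot actions are the events named "utter_*" or "action_*".
            if next_event.startswith(("utter_", "action_")):
                memo[tuple(story[:i])] = next_event
    return memo


def predict_next(
    history: List[str],
    memo: Dict[History, str],
    fallback: Callable[[List[str]], str],
) -> Tuple[str, str]:
    """Return (action, source): the memorized action on an exact match,
    otherwise whatever the fallback policy predicts."""
    memorized = memo.get(tuple(history))
    if memorized is not None:
        return memorized, "memoization"
    return fallback(history), "fallback"


def fallback_policy(history: List[str]) -> str:
    """Stub standing in for a learned policy; always returns the same action."""
    return "utter_default"


stories = [["intent_greet", "utter_greet", "intent_bye", "utter_bye"]]
memo = build_memo(stories)

# The conversation matches the training story exactly: memoization answers.
exact = predict_next(["intent_greet"], memo, fallback_policy)

# The conversation has extra history in front: no exact match, so fall back.
longer = predict_next(
    ["intent_mood", "utter_cheer", "intent_greet"], memo, fallback_policy
)
```

In this sketch the second call misses the lookup only because the tracker carries extra history before the greeting, which is the situation described above. The learned policy then decides the action.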