Question about embedding policy architecture

Hello. I read the paper about REDP (https://arxiv.org/pdf/1811.11707.pdf). According to the RNN part of section 3, the output of the LSTM is fed to an embedding layer, and then the sum of that embedding and the system attention vector is used as the dialogue state embedding.

(The output of this cell is fed to another embedding layer to create an embedding of the cell output for the current time step. The sum of this embedded cell output and system attention vector is used as the dialogue state embedding.)

However, in Figure 2, the output of the LSTM is first added to the system attention vector and the sum is then fed to the embedding layer, which is the purple box on the far right.

Am I misunderstanding something?

Thanks in advance.

You’re right. It is a mistake in the diagram.
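
To make the difference between the two orderings concrete, here is a minimal sketch in plain Python. It is not the REDP implementation: `embed`, `add`, the weight matrix `W`, and all the toy values are hypothetical, and the embedding layer is reduced to a bias-free linear map. It only shows that "embed, then add" (the text of section 3) and "add, then embed" (Figure 2 as drawn) generally give different results.

```python
def embed(vec, weights):
    """Toy embedding layer: a linear map with no bias and no activation."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def add(a, b):
    """Element-wise sum of two vectors of equal length."""
    return [x + y for x, y in zip(a, b)]


# Hypothetical toy values, chosen only for illustration.
cell_output = [1.0, 2.0]        # output of the LSTM cell at the current step
system_attention = [0.5, -1.0]  # system attention vector
W = [[2.0, 0.0],
     [0.0, 3.0]]                # weights of the toy embedding layer

# Order described in the text: embed the cell output, then add the attention vector.
state_per_text = add(embed(cell_output, W), system_attention)

# Order drawn in Figure 2: add the attention vector first, then embed the sum.
state_per_figure = embed(add(cell_output, system_attention), W)

print(state_per_text)    # [2.5, 5.0]
print(state_per_figure)  # [3.0, 3.0]
```

Note that in the "embed, then add" order the attention vector has to have the same dimension as the embedding output, since the two are summed directly.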

ah… okay thanks!

What is written to the user and system memory in the figures? Is it just the embedding vectors for user inputs and system actions, in chronological order for each story? And does the attention mechanism then produce another vector that determines which parts to ignore and which parts to pay attention to?

Yes.
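
As a rough illustration of that picture, the sketch below treats a memory as a chronological list of embedding vectors and computes attention weights over it. This is an assumption-laden simplification: it uses plain dot-product attention with a softmax, which is not the exact attention mechanism from the paper, and `attend`, `user_memory`, and `query` are hypothetical names with made-up values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Return (weights, context) for dot-product attention over memory slots."""
    scores = [sum(q * k for q, k in zip(query, slot)) for slot in memory]
    weights = softmax(scores)  # one weight per time step, summing to 1
    dim = len(memory[0])
    # The context vector is the attention-weighted sum of the memory slots.
    context = [sum(w * slot[i] for w, slot in zip(weights, memory))
               for i in range(dim)]
    return weights, context


# Hypothetical memory: one embedded user input per time step, oldest first.
user_memory = [[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]]
query = [0.0, 2.0]  # stands in for whatever the policy uses to address memory

weights, context = attend(query, user_memory)
print(weights)  # low weight on the first step, equal higher weights on the others
print(context)
```

A small weight means the corresponding time step is mostly ignored, while a large weight means it contributes strongly to the resulting attention vector.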