I need to use a custom model architecture in the Rasa Keras policy, so I need to understand how the policy produces its prediction probabilities. As I understand it, during training the training data is first transformed into vectors, and then the inputs and the actions to be predicted are randomly shuffled.
The shuffled inputs and actions are then used to train the model, which is a simple LSTM. Could someone give a clear explanation of why the vector transformation is performed? How does it influence the predicted actions?
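
To make my understanding concrete, here is a minimal sketch of the two steps as I picture them. This is not Rasa's actual code: the intent and action names, and the helpers `one_hot`, `featurize`, and `shuffle_together`, are hypothetical and only illustrate turning dialogue states into vectors and shuffling inputs and targets together.

```python
import random

# Hypothetical vocabularies; a real Rasa domain defines its own intents and actions.
INTENTS = ["greet", "ask_weather", "goodbye"]
ACTIONS = ["action_listen", "utter_greet", "utter_weather", "utter_goodbye"]


def one_hot(item, vocabulary):
    """Return a list with 1.0 at the index of `item` and 0.0 elsewhere."""
    return [1.0 if entry == item else 0.0 for entry in vocabulary]


def featurize(intent, previous_action):
    """Encode one dialogue state as a fixed-length numeric vector."""
    return one_hot(intent, INTENTS) + one_hot(previous_action, ACTIONS)


def shuffle_together(inputs, targets, seed=0):
    """Shuffle inputs and targets with the same permutation so pairs stay aligned."""
    order = list(range(len(inputs)))
    random.Random(seed).shuffle(order)
    return [inputs[i] for i in order], [targets[i] for i in order]


# Toy training examples: (intent, previous action) -> next action.
examples = [
    ("greet", "action_listen", "utter_greet"),
    ("ask_weather", "action_listen", "utter_weather"),
    ("goodbye", "action_listen", "utter_goodbye"),
]

X = [featurize(intent, prev) for intent, prev, _ in examples]
y = [one_hot(action, ACTIONS) for _, _, action in examples]
X_shuffled, y_shuffled = shuffle_together(X, y)
```

After the shuffle, each row of `X_shuffled` still lines up with the same row of `y_shuffled`, which is the form I assume the LSTM is trained on.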