How does the Rasa Keras policy compute prediction probabilities?

I want to use a custom model architecture in the Rasa Keras policy, so I need to understand how the policy produces its prediction probabilities. As I understand it, during training the training data is first transformed into vectors, and then the inputs and the actions to predict are randomly shuffled.

The shuffled inputs and actions are then used to train the model, which is a simple LSTM. Could I get a clear explanation of why the vector transformation is performed? How does it influence the predicted actions?

What do you mean by vector transformation?

After analysing the script: in the ‘train’ function of KerasPolicy, the ‘training trackers’ are transformed into a vector representation using the featurizer (rasa/ at master · RasaHQ/rasa · GitHub).

Please note the function ‘featurize_for_training’ (rasa/ at 510d0fd8b77731d225d0d7c48f1cea5842924162 · RasaHQ/rasa · GitHub).

These vector representations are then shuffled using numpy, giving shuffled_X and shuffled_y, which are used to train the model.

The stories need to be transformed into numbers; that’s why featurization is performed. Shuffling the data points is a standard procedure before training ML models.
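To illustrate the shuffling step, here is a minimal sketch of shuffling inputs and labels together with numpy. The toy arrays and the seed are made up for the example and this is not the actual Rasa code; only the names shuffled_X and shuffled_y come from the policy.

```python
import numpy as np

# Toy stand-ins for featurized stories (assumed shapes, not Rasa's real ones).
X = np.arange(12).reshape(4, 3)   # 4 samples, 3 features each
y = np.array([0, 1, 2, 3])        # one action index per sample

# Draw ONE random ordering of the sample indices and apply it to both
# arrays, so every input row stays paired with its own label.
rng = np.random.default_rng(seed=0)
idx = rng.permutation(len(X))

shuffled_X = X[idx]
shuffled_y = y[idx]
```

The important point is that the same permutation is applied to both arrays. Shuffling X and y independently would break the pairing between each input and its target action.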

Thanks for the reply! Could you please elaborate? Are the vector representations based on the StarSpace algorithm, or what type of featurizer is used to convert the samples into vector representations?

One-hot encoding of intents and actions.
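As a rough illustration of what one-hot encoding means here, the sketch below encodes one dialogue turn in plain Python. The intent and action names, the one_hot helper, and the way the two vectors are concatenated are all hypothetical and simplified; this is not the Rasa featurizer code.

```python
def one_hot(label, vocabulary):
    """Return a list with 1.0 at the label's position and 0.0 elsewhere."""
    vec = [0.0] * len(vocabulary)
    vec[vocabulary.index(label)] = 1.0
    return vec

# Hypothetical domain, made up for this example.
intents = ["greet", "ask_weather", "goodbye"]
actions = ["action_listen", "utter_greet", "utter_weather"]

# Input for one turn: the user's intent plus the previous action,
# each one-hot encoded and then concatenated into a single vector.
x = one_hot("greet", intents) + one_hot("action_listen", actions)

# Target: the action the model should predict next.
y = one_hot("utter_greet", actions)

print(x)  # [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
print(y)  # [0.0, 1.0, 0.0]
```

With targets encoded like this, a model whose output layer has one unit per action can be trained to output a probability for each action, and the action with the highest probability is the prediction.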