I want to use a custom model architecture in the Rasa Keras policy, so I need to understand how the policy produces its prediction probabilities. As I understand it, during training the training data is first transformed into vectors, and then the inputs and the actions to be predicted are randomly shuffled.
The shuffled inputs and actions are then used to train the model, which is a simple LSTM.
Could I get a clear explanation of why the vector transformation is performed? How does it influence the predicted actions?
The stories need to be transformed into numbers; that's why featurization is performed. Shuffling the data points is a standard procedure before training ML models.
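To make the two steps concrete, here is a minimal sketch in plain Python. It is not Rasa's actual featurizer: the intent and action names, the `featurize_state` and `shuffle_together` helpers, and the one-hot layout are all assumptions made up for illustration. It only shows the general idea of turning each dialogue step into a numeric vector and then shuffling the inputs and labels with the same permutation so the pairs stay aligned.

```python
import random

# Hypothetical vocabularies, invented for this example (not taken from Rasa).
INTENTS = ["greet", "ask_weather", "goodbye"]
ACTIONS = ["action_listen", "utter_greet", "utter_weather", "utter_goodbye"]


def featurize_state(intent, previous_action):
    """Encode one dialogue step as a binary vector:
    one-hot intent followed by one-hot previous action."""
    vec = [0] * (len(INTENTS) + len(ACTIONS))
    vec[INTENTS.index(intent)] = 1
    vec[len(INTENTS) + ACTIONS.index(previous_action)] = 1
    return vec


def featurize_story(story):
    """Turn a list of (intent, previous_action, next_action) steps
    into input vectors X and integer action labels y."""
    X, y = [], []
    for intent, previous_action, next_action in story:
        X.append(featurize_state(intent, previous_action))
        y.append(ACTIONS.index(next_action))
    return X, y


def shuffle_together(X, y, seed=0):
    """Shuffle inputs and labels with one shared permutation,
    so every input keeps its own label."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    return [X[i] for i in order], [y[i] for i in order]


# A toy story: each step is (user intent, previous action, action to predict).
story = [
    ("greet", "action_listen", "utter_greet"),
    ("ask_weather", "action_listen", "utter_weather"),
    ("goodbye", "action_listen", "utter_goodbye"),
]

X, y = featurize_story(story)
X_shuffled, y_shuffled = shuffle_together(X, y)
```

The model never sees the strings, only the vectors in `X` and the label indices in `y`, which is why the choice of featurization determines what the model can learn to predict.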
Thanks for the reply! Could you please elaborate? Are the vector representations based on the StarSpace algorithm, or what type of featurizer is used to convert the samples into vector representations?