Hi, I have created a ‘sentence-similarity’ model using ‘sentence_transformers’.
I want to use this model in my pipeline as a custom featurizer that creates an embedding for each sentence, and then combine those vectors with the vectors produced by CountVectorsFeaturizer (I read somewhere that CountVectorsFeaturizer is good for picking up the special words of our domain).
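To make the idea concrete, here is a rough sketch of the combination I have in mind, in plain Python and outside of Rasa. The vocabulary, the example sentence, and the 4-dimensional embedding values are made up for illustration; in my real setup the dense vector would come from calling encode on my sentence_transformers model. The helper names count_vector and combine are hypothetical and not part of any library.

```python
from typing import Dict, List


def count_vector(tokens: List[str], vocabulary: Dict[str, int]) -> List[float]:
    """Bag-of-words counts over a fixed vocabulary (token -> column index)."""
    counts = [0.0] * len(vocabulary)
    for token in tokens:
        index = vocabulary.get(token.lower())
        if index is not None:  # ignore out-of-vocabulary tokens
            counts[index] += 1.0
    return counts


def combine(sentence_embedding: List[float], counts: List[float]) -> List[float]:
    """Concatenate the dense sentence vector with the count vector."""
    return list(sentence_embedding) + list(counts)


# Hypothetical domain vocabulary and a made-up sentence embedding.
# Assumption: the real embedding would be the output of my
# sentence_transformers model's encode() for the same text.
vocabulary = {"invoice": 0, "refund": 1, "warranty": 2}
sentence_embedding = [0.12, -0.40, 0.33, 0.05]

tokens = "I want a refund for this invoice".split()  # whitespace tokenization
features = combine(sentence_embedding, count_vector(tokens, vocabulary))
print(len(features))  # 4 dense dimensions + 3 vocabulary counts
```

So the result is a single vector per sentence, with the dense part first and the domain-word counts appended after it.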
I have WhitespaceTokenizer and DIETClassifier in my pipeline, and I want to place my custom component between them. However, I read that DIETClassifier gets an embedding for each word and doesn’t get an embedding of the whole sentence.
I’m stuck and have no idea what to do. How can I implement my featurizer (I am working with Rasa 2.3.1)? Should I change my tokenizer so that it tokenizes sentences? And can I concatenate the sentence embedding from the sentence transformer with the word vectors from CountVectorsFeaturizer and feed them to DIETClassifier? If not, which intent classifier and entity extractor can I use instead?
I appreciate any help.