Hello! I’m experimenting with the CALM demo bot and found an issue. I use `cohere.embed-multilingual-v3` via Bedrock as the embedding model, and when I run `rasa train` with it and the default `config.yml` from the demo repository (I also changed the LLM component to one of the Bedrock models), I get the following error while training the `IntentlessPolicy`:
2025-05-05 15:57:49 ERROR rasa.core.policies.intentless_policy - [error ] intentless_policy.train.llm.error error=ProviderClientAPIException('Failed to embed documents\nOriginal error: litellm.BadRequestError: BedrockException - {"message":"Malformed input request: #/texts: expected maximum item count: 128, found: 300, please reformat your input and try again."})')
I looked deeper into it and, indeed, the API call to the embedding model passes a list of 300 phrases in a single request. From what I understood, these phrases are taken from `domain/nlu_based`, `domain/_shared.yml`, and `domain/search`. It seems most of them come from the SQuAD dataset in `domain/search/squad.yml`.
I also tried the `amazon.titan-embed-text-v2:0` embedding model on Bedrock, and it seems LiteLLM handles these two models differently:
- if I pass a list of phrases to `litellm.embedding(model="amazon.titan-embed-text-v2:0", ...)`, under the hood it makes a separate request for each phrase.
- for `cohere.embed-multilingual-v3`, it makes only one request, and there is an upper limit on the number of elements to embed (128, according to the error above).
I just wanted to bring this particularity of Cohere (and maybe other embedding models) to your attention, since it may affect embedding-related components. It may be worth splitting all documents into batches in `embed()` of `shared.providers.embedding.embedding_client.EmbeddingClient`.
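To illustrate the kind of batching I have in mind, here is a rough sketch. It is not Rasa's actual code: `embed_in_batches`, `fake_embed`, and `call_sizes` are hypothetical names, and the limit of 128 is simply taken from the error message above. `fake_embed` is a stand-in for the real embedding call so that the example runs without Bedrock credentials.

```python
from typing import Callable, List, Sequence


def embed_in_batches(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = 128,  # assumed limit, taken from the Bedrock error message
) -> List[List[float]]:
    """Call embed_fn on chunks of at most batch_size texts and keep the input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    embeddings: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        embeddings.extend(embed_fn(batch))
    return embeddings


# Hypothetical stand-in for the provider call; it rejects oversized requests
# the same way the Cohere model on Bedrock did in the error above.
call_sizes: List[int] = []


def fake_embed(batch: List[str]) -> List[List[float]]:
    if len(batch) > 128:
        raise ValueError(f"expected maximum item count: 128, found: {len(batch)}")
    call_sizes.append(len(batch))
    # Dummy one-dimensional "embedding": the length of each text.
    return [[float(len(text))] for text in batch]


phrases = [f"phrase {i}" for i in range(300)]
vectors = embed_in_batches(phrases, fake_embed)
print(len(vectors), call_sizes)
```

With 300 phrases this makes three requests instead of one, and none of them exceeds the limit. In a real client, `embed_fn` would wrap the actual provider call, and the batch size would probably need to be configurable per model, since Titan and Cohere clearly behave differently here.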
Also, a question: am I understanding correctly that during training, the `IntentlessPolicy` imports all the responses in the domain YAML files that are not used in any flow?
Thank you!