LLM command generator looking for OpenAI API key even with llama model

recipe: default.v1
language: en
pipeline:
- name: LLMCommandGenerator
  llm:
    type: "llamacpp"
    model_path: "llms/llama-2-7b-chat.Q5_K_M.gguf"
    temperature: 0.7

policies:
- name: FlowPolicy

#  - name: EnterpriseSearchPolicy
#  - name: RulePolicy
assistant_id: 20240408-143016-huge-marble

This is my config.yml modified for the tutorial example. However, when I run rasa train, I get the following error.

2024-04-08 14:49:59 INFO     rasa.dialogue_understanding.generator.llm_command_generator  - [info     ] llm_command_generator.flow_retrieval.enabled
Traceback (most recent call last):
  File "/Users/vivek/Documents/rasaenv/lib/python3.10/site-packages/rasa/engine/graph.py", line 526, in __call__
    output = self._fn(self._component, **run_kwargs)
  File "/Users/vivek/Documents/rasaenv/lib/python3.10/site-packages/rasa/dialogue_understanding/generator/llm_command_generator.py", line 213, in train
    self.flow_retrieval.populate(flows.user_flows, domain)
  File "/Users/vivek/Documents/rasaenv/lib/python3.10/site-packages/rasa/dialogue_understanding/generator/flow_retrieval.py", line 181, in populate
    embeddings = self._create_embedder(self.config)
  File "/Users/vivek/Documents/rasaenv/lib/python3.10/site-packages/rasa/dialogue_understanding/generator/flow_retrieval.py", line 153, in _create_embedder
    return embedder_factory(
  File "/Users/vivek/Documents/rasaenv/lib/python3.10/site-packages/rasa/shared/utils/llm.py", line 245, in embedder_factory
    return embeddings_cls(**parameters)
  File "/Users/vivek/Documents/rasaenv/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for OpenAIEmbeddings
__root__
  Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass  `openai_api_key` as a named parameter. (type=value_error)

Hi @MannavaVivek,

The issue you are seeing is caused by a breaking change introduced in Rasa Pro 3.8.0. This release introduces a new feature, flow retrieval, which ensures that only the flows relevant to the conversation context are included in the prompt sent to the LLM by the LLMCommandGenerator. This helps the assistant scale to a higher number of flows and also reduces LLM costs.

This feature is enabled by default, and we recommend using it if the assistant has more than 40 flows. By default, it uses embedding models from OpenAI, but if you are using a different provider (e.g. Azure), please ensure that:

  1. An embedding model is configured with the provider.
  2. The LLMCommandGenerator is configured correctly to connect to that embedding provider. For example, see the section on the configuration required to connect to the Azure OpenAI service (a sketch follows below).
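
As a rough sketch (the deployment names are placeholders, and the exact keys should be double-checked against the Azure OpenAI section of the docs for your Rasa Pro version), an Azure-backed setup could look like:

pipeline:
  - name: LLMCommandGenerator
    llm:
      type: "azure"
      deployment: "<your-chat-deployment>"        # placeholder deployment name
    flow_retrieval:
      embeddings:
        type: "azure"
        deployment: "<your-embedding-deployment>" # placeholder deployment name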

If you wish to disable the feature, you can configure the LLMCommandGenerator as follows:

config.yml

pipeline:
  - name: LLMCommandGenerator
    ...
    flow_retrieval:
      active: false
    ...

Check the migration guide to learn more about what changed in Rasa Pro 3.8.0.


Setting flow_retrieval to false works, but if it's set to true and I configure a different embedding provider, it still seems to look for OpenAI. Here is my config for the LLMCommandGenerator:

pipeline:
- name: LLMCommandGenerator
  llm:
    type: "cohere"
    model: "command-r-plus"
    temperature: 0.8
  embeddings:
    type: "cohere"
    model: "embed-english-light-v3.0"

I don't think that configuration looks quite right. Try something like:

pipeline:
- name: LLMCommandGenerator
  llm:
    type: "cohere"
    model: "command-r-plus"
    temperature: 0.8
  flow_retrieval:
    embeddings:
      type: "cohere"
      model: "embed-english-light-v3.0"

I am getting this error when I turn flow_retrieval on and then turn it off:

Retrying langchain.llms.openai.acompletion_with_retry.._completion_with_retry in 8.0 seconds as it raised APIConnectionError: Error communicating with OpenAI.

Whereas my standalone OpenAI test code works fine:

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."},
        {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."}
    ]
)

print(completion.choices[0].message)

Hi Geeta,

The flow_retrieval embeddings settings need to be under that key. For example:

- name: LLMCommandGenerator
  llm:
    type: "cohere"
    model: "command-r-plus"
    temperature: 0.8
  flow_retrieval:
    embeddings:
      model: "embed-english-light-v3.0"
      type: "cohere"
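
One thing to double-check on your side (an assumption about your environment rather than something visible in the config): the Cohere LLM and embeddings read the API key from the COHERE_API_KEY environment variable, so it needs to be set before you run rasa train.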

Let us know if that helps!

Thanks Chris. I could get this working with GPT-3.5. But what I notice is that even for an intent like hello, where the confidence is 0.9999:

parse_data_entities= parse_data_intent={'name': 'greet', 'confidence': 0.9999834299087524} parse_data_text=Hello

Somehow the Intentless policy is getting triggered:

2024-05-08 21:56:11 DEBUG rasa.dialogue_understanding.processor.command_processor - [debug    ] command_processor.clean_up_commands.prepend_command_chitchat_answer command=ChitChatAnswerCommand() defined_intentless_policy_in_config=True pattern_chitchat_uses_action_trigger_chitchat=False

My config.yml has these policies:

policies:
- name: FlowPolicy
  confidence_threshold: 0.7
- name: IntentlessPolicy
  nlu_threshold: 0.4

So the bot's behaviour has become unpredictable now. Please suggest what can be done about this.

Thanks, Geeta

Indeed, that's it: Rasa uses OpenAI embeddings if no embedding model is specified.
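
If you want the IntentlessPolicy to stop falling back to OpenAI, a sketch along these lines should work, assuming IntentlessPolicy accepts the same llm and embeddings keys as the LLMCommandGenerator (the model names are just examples):

policies:
- name: FlowPolicy
  confidence_threshold: 0.7
- name: IntentlessPolicy
  nlu_threshold: 0.4
  llm:
    type: "cohere"          # example provider; match whatever you use for the command generator
    model: "command-r-plus"
  embeddings:
    type: "cohere"
    model: "embed-english-light-v3.0"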