Hello Community!! I have a perfectly working Rasa Pro CALM (v3.8.7) chatbot that depends on OpenAI for components such as the Enterprise Search Policy, Intentless Policy, Command Generator, and Contextual Response Rephraser. Now I need to make the chatbot completely independent, so I need to use a local LLM (preferably via Ollama or a similar setup). I tried doing that but wasn't successful. I will share my previous config setup, which I used with GPT, and the new setup in which I'm trying to use the OpenAI entry point to reach the local Ollama model. Kindly help me with the right configuration and a clear procedure for integrating a local LLM with Rasa Pro CALM so that it can replace OpenAI in every way. Any help regarding this is highly appreciated.
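To make it concrete, here is a rough sketch of the direction I'm attempting for the command generator. The model name and API base below are placeholders rather than my exact values, and `openai_api_base` is my assumption for how to point the OpenAI entry point at Ollama's OpenAI-compatible endpoint:

```yaml
# config.yml (sketch only, placeholder values)
pipeline:
  - name: LLMCommandGenerator
    llm:
      type: openai                                   # reusing the OpenAI entry point
      model_name: "llama3"                           # placeholder: whichever model Ollama is serving
      openai_api_base: "http://localhost:11434/v1"   # assumption: Ollama's OpenAI-compatible endpoint
```

From what I understand, the OpenAI client may still expect OPENAI_API_KEY to be set to some dummy value even though Ollama ignores it.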
Thank you for your response. I will try this and get back to you. Meanwhile, could you please let me know which embedding provider you are using? If possible, please share the config and endpoint configuration here. Many thanks in advance.
Hi,
For this specific example I didn't configure embeddings, so the default OpenAI embeddings were used. You don't need to configure the endpoint for Ollama; the default localhost will be used. Check out the documentation here to see how to configure embeddings.
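If you later want to move embeddings off OpenAI as well, the embeddings block sits next to the llm block on the components that use it. A minimal sketch, assuming a Hugging Face provider and an open sentence-transformers model (I haven't verified these exact values against your version, so check the providers docs):

```yaml
pipeline:
  - name: LLMCommandGenerator
    llm:
      # ... your existing llm config ...
    embeddings:
      type: huggingface                                       # assumption: pick a provider from the docs
      model_name: "sentence-transformers/all-MiniLM-L6-v2"    # example open embedding model
```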
I looked at your config a little more closely and there is one issue with it:
You shouldn't configure llm and embeddings for the FlowPolicy; it doesn't use an LLM at all.
The llm configuration for the EnterpriseSearchPolicy should be similar to the one for the CommandGenerator.
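Concretely, the policies section would look roughly like this (a sketch; the llm block is whatever you already use for the CommandGenerator):

```yaml
policies:
  - name: FlowPolicy                  # no llm or embeddings here
  - name: EnterpriseSearchPolicy
    llm:
      # ... same llm settings as in your LLMCommandGenerator ...
```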
Hello @Balowen, I don't have access to OpenAI and I want a local LLM. I am struggling to find a way to implement this. My data source is essentially a database, for which I believe flows should work perfectly, but I am not able to run the bot since OpenAI is not allowed and is blocked.
Thanks in advance for your help.
My bot is using the following components: LLMCommandGenerator, FlowPolicy, IntentlessPolicy, EnterpriseSearchPolicy, and rasa.core.ContextualResponseRephraser. Could you tell me which of these require LLM configuration and which of them demand embedding configuration?
I have been trying ever since. Ollama did not seem to work for me with the configuration you mentioned. By the way, can you confirm whether Rasa expects embeddings in a fixed length or format? Thanks a lot!!
@Sanjukta.bs
If you have access to a powerful machine, you could try CALM with a local LLM like Llama 3. The easiest way to run it is with Ollama (e.g. pull the model with `ollama pull llama3`; it is served on localhost by default). Be aware, though, that models like Llama 3 8B aren't as powerful as GPT-3.5 or GPT-4 and won't perform well in command generation, which is essential for understanding the user and triggering the correct flows.
The components in Rasa Pro that require LLM configuration are the LLMCommandGenerator, IntentlessPolicy, EnterpriseSearchPolicy, and ContextualResponseRephraser; the FlowPolicy does not use an LLM at all. The IntentlessPolicy and EnterpriseSearchPolicy additionally need embeddings, and the LLMCommandGenerator uses embeddings for flow retrieval.
Refer to this page to check the supported embedding providers: LLM Providers.
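Putting it together, config.yml and endpoints.yml end up looking roughly like this. Treat it as a sketch: the ContextualResponseRephraser is configured under nlg in endpoints.yml rather than in config.yml, and the actual llm/embeddings values should come from the provider docs:

```yaml
# config.yml (sketch)
pipeline:
  - name: LLMCommandGenerator
    llm: ...          # required
    embeddings: ...   # used for flow retrieval
policies:
  - name: FlowPolicy  # needs neither
  - name: IntentlessPolicy
    llm: ...
    embeddings: ...
  - name: EnterpriseSearchPolicy
    llm: ...
    embeddings: ...
```

```yaml
# endpoints.yml (sketch)
nlg:
  type: rephrase      # the ContextualResponseRephraser
  llm: ...
```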
For EnterpriseSearch, if Rasa doesn’t natively support a particular embedding model that you want to use, custom information retrieval comes to the rescue. You can integrate local or fine-tuned embedding models of your choice to generate embeddings for search queries and documents.
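In practice that means pointing the policy's vector store at your own retriever class, which can embed queries and documents with whatever local model you like. The class path below is a made-up placeholder, and the exact interface your class has to implement is described in the Custom Information Retrieval docs, so take this only as an outline:

```yaml
policies:
  - name: EnterpriseSearchPolicy
    vector_store:
      type: "addons.my_retriever.MyRetriever"   # placeholder: module path to your custom retriever
```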
Could you share your whole config, how you are running Ollama, and any error logs from running rasa with `--debug`?