How to use Ollama models in Rasa CALM?

Hello, I am trying to use models served through Ollama with Rasa CALM. I could not find anything about it in the documentation; any leads or tips would be helpful.

Hi Vishni,

The easiest way I’ve found to use Ollama (assuming you’re running the Ollama server locally) is to use its OpenAI-compatible endpoint.

Your config would look similar to:

pipeline:
- name: LLMCommandGenerator
  llm:
    model: "wizardlm2:7b"
    max_tokens: 20
    type: openai
    openai_api_base: http://localhost:11434/v1  # Ollama's OpenAI-compatible endpoint
    openai_api_key: foobar  # any placeholder value; Ollama does not check the key

And, of course, you can replace the value for model with whichever model you have downloaded to your Ollama installation.
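
If you haven’t pulled the model yet, you can fetch it and see what’s available with the Ollama CLI (the model name here is just an example):

ollama pull wizardlm2:7b
ollama list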

Hope this helps!

Hi Chris, thanks for the quick response! This is of great help; I will try it out.

2024-05-13 17:28:19 WARNING  langchain.llms.base  - Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised Timeout: Request timed out: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=7).
2024-05-13 17:28:34 WARNING  langchain.llms.base  - Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 10.0 seconds as it raised Timeout: Request timed out: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=7).
2024-05-13 17:28:51 ERROR    rasa.utils.log_utils  - [error    ] llm_command_generator.llm.error error=Timeout(message="Request timed out: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=7)", http_status=None, request_id=None)

I tried to use Ollama as instructed, and it threw the error above. I suspect it’s the same issue discussed here, and judging by the status of that issue, it doesn’t seem to be getting a fix anytime soon. Can anyone suggest alternatives for running local models?

Hi Vivek,

Are you able to access your locally running Ollama server with curl? e.g.

curl http://localhost:11434/api/generate -d '{ "model": "phi3", "prompt": "Why is the sky blue?", "stream": false }'
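
Since the Rasa config points at the OpenAI-compatible /v1 path rather than Ollama’s native API, it may also be worth checking that endpoint directly (the model name below is just a placeholder for whatever you have pulled):

curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "llama3:latest", "messages": [{ "role": "user", "content": "Why is the sky blue?" }] }'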

Yes, Ollama is accessible via curl and via the Python OpenAI API client, but it doesn’t seem to work when called from Rasa. The config is:

pipeline:
- name: LLMCommandGenerator
  llm:
    model: "llama3:latest"
    type: openai
    openai_api_base: http://localhost:11434/v1
    openai_api_key: foobar
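
Looking at the logs, the request is failing on a 7-second read timeout, and a local model can easily take longer than that on the first request while Ollama loads it into memory. One thing worth trying, assuming the llm block forwards extra parameters to the underlying LangChain OpenAI client (where request_timeout is a standard option), is to raise the timeout and cap the completion length so the model can answer within the window. A sketch:

pipeline:
- name: LLMCommandGenerator
  llm:
    model: "llama3:latest"
    type: openai
    openai_api_base: http://localhost:11434/v1
    openai_api_key: foobar
    request_timeout: 60  # assumption: passed through as the client's read timeout
    max_tokens: 20       # shorter completions return faster on local hardware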