Hello, I am trying to use models served through Ollama with Rasa CALM. I could not find anything about this in the documentation; any leads or tips would be helpful.
Hi Vishni,
The easiest way I’ve found to use Ollama (presuming you’re running the Ollama server locally) is to use its OpenAI-compatible endpoint.
Your config would look similar to:
pipeline:
  - name: LLMCommandGenerator
    llm:
      model: "wizardlm2:7b"
      max_tokens: 20
      type: openai
      openai_api_base: http://localhost:11434/v1
      openai_api_key: foobar
And, of course, you can replace the value for model with whichever model you have downloaded to your Ollama installation.
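If you're not sure what you have locally, the standard Ollama CLI can show installed models and pull new ones (the model name below is just an example):

ollama list                 # show models already downloaded
ollama pull wizardlm2:7b    # fetch a model if it isn't installed yet

Whatever name ollama list reports is what goes into the model field.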
Hope this helps!
Hi Chris, thanks for the quick response! This is a great help; I will try it out.
2024-05-13 17:28:19 WARNING langchain.llms.base - Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised Timeout: Request timed out: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=7).
2024-05-13 17:28:34 WARNING langchain.llms.base - Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 10.0 seconds as it raised Timeout: Request timed out: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=7).
2024-05-13 17:28:51 ERROR rasa.utils.log_utils - [error ] llm_command_generator.llm.error error=Timeout(message="Request timed out: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=7)", http_status=None, request_id=None)
I tried to use Ollama as instructed, and it threw this error. I suspect it's the same issue discussed here, and judging by the status of that issue, it doesn't seem to be getting a fix anytime soon. So can anyone suggest alternatives for running local models?
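For what it's worth, the read timeout of 7 seconds in the logs looks like a default request_timeout on the llm config. I'm not sure raising it is enough, but a sketch (assuming the key is passed through to the underlying OpenAI client) would be:

pipeline:
  - name: LLMCommandGenerator
    llm:
      model: "llama3:latest"
      type: openai
      openai_api_base: http://localhost:11434/v1
      openai_api_key: foobar
      request_timeout: 60   # assumption: forwarded to the client, replacing the 7s default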
Hi Vivek,
Are you able to access your locally running Ollama server with curl? e.g.
curl http://localhost:11434/api/generate -d '{ "model": "phi3", "prompt": "Why is the sky blue?", "stream": false }'
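Since Rasa talks to the OpenAI-compatible /v1 routes rather than /api/generate, it may also be worth checking that layer directly, e.g. via the chat completions path (the model name here is just a placeholder):

curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "llama3:latest", "messages": [{"role": "user", "content": "Why is the sky blue?"}] }'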
Yes, Ollama is accessible via curl and the Python OpenAI API client, but it doesn't seem to work when called via Rasa. The config is:
pipeline:
  - name: LLMCommandGenerator
    llm:
      model: "llama3:latest"
      type: openai
      openai_api_base: http://localhost:11434/v1
      openai_api_key: foobar
@MannavaVivek did you find any solution?
With an upgraded version of Rasa (mine is 3.9.3), you can call the model directly:
type: ollama
model: "llama3.1:8b"
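For context, a minimal sketch of where those keys would sit, assuming the same LLMCommandGenerator setup shown earlier in the thread:

pipeline:
  - name: LLMCommandGenerator
    llm:
      type: ollama          # talk to Ollama directly instead of the OpenAI-compatible route
      model: "llama3.1:8b"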
Did the update solve it?