chatz
(Maria)
August 14, 2024, 9:00am
1
Hello everyone, I was trying to deploy a local LLM with Rasa Pro and I finally found a solution. Here are the details if anyone needs them:
I installed text-generation-webui → link
Then:
I started the server with ./start_linux.sh
Loaded the model through the “Model” tab
In the “Session” tab I selected the openai, api, and listen flags and pressed “Apply flags”
In the Rasa endpoints.yml:
nlg:
  type: rephrase
  rephrase_all: true
  llm:
    model: 'model_gemma_27b_it'
    model_name: 'model_gemma_27b_it'
    type: "openai"
    openai_api_key: "NULL"
    openai_api_base: http://127.0.0.1:5000/v1
    request_timeout: 800
If you get the error
AttributeError: module 'openai' has no attribute 'error'
you have to install this version:
pip install openai==0.28.1
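As a quick sanity check before pointing Rasa at it, here is a minimal sketch (assuming the pinned openai==0.28.1 client and the server flags above) that queries the local OpenAI-compatible endpoint directly:

# Minimal sketch: query text-generation-webui's OpenAI-compatible API directly,
# assuming openai==0.28.1 and the server started with the openai/api/listen flags.
import openai

openai.api_key = "NULL"                       # the local server ignores the key
openai.api_base = "http://127.0.0.1:5000/v1"  # same base URL as in endpoints.yml

response = openai.ChatCompletion.create(
    model="model_gemma_27b_it",   # whichever model you loaded in the "Model" tab
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    request_timeout=800,
)
print(response.choices[0].message.content)

If this prints a reply, the server side is fine and any remaining issues are on the Rasa configuration side.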
Sanjukta.bs
(Sanjukta Biswas)
September 12, 2024, 6:53pm
2
Hey, what changes should I make to my config for this to work? Also, I don't have access to an OpenAI API key; how can I bypass the OpenAI API key check? It keeps popping up.
Any help will be appreciated.
Thanks
Try using a Hugging Face model
(Mixtral would be fine)
Sanjukta.bs
(Sanjukta Biswas)
September 15, 2024, 6:22am
4
I want to use local models. I was trying Ollama, but it takes a lot of time to generate a reply, so I am stuck.
That’s the main sticking point with HF models.
Try and see if you can use vLLM.
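If you go that route, vLLM also exposes an OpenAI-compatible server, so the same approach from the first post should carry over. A minimal sketch, assuming vLLM's default port 8000 and a Mixtral model name (both placeholders, not from this thread):

# Sketch: the same openai==0.28.1 client pointed at a vLLM OpenAI-compatible
# server; the port and Mixtral model id are assumptions, not from the thread.
import openai

openai.api_key = "NULL"
openai.api_base = "http://127.0.0.1:8000/v1"  # vLLM's OpenAI-compatible endpoint

response = openai.ChatCompletion.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # must match the model vLLM is serving
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

The endpoints.yml from the first post would then just need openai_api_base and the model names updated to match.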
chatz
(Maria)
September 16, 2024, 8:51am
6
Hello, if the model takes too long to generate, I believe the problem is that your system is struggling to run the model. You could try to improve it by reducing the number of tokens generated or adjusting other configuration variables.
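One rough way to check this (a minimal sketch against the same local endpoint, with hypothetical values) is to time direct requests with different max_tokens limits and see how much of the latency comes from generation length:

# Sketch: time two direct requests to the local server to compare how latency
# scales with generation length (the token limits here are just examples).
import time
import openai

openai.api_key = "NULL"
openai.api_base = "http://127.0.0.1:5000/v1"

for max_tokens in (32, 256):
    start = time.time()
    openai.ChatCompletion.create(
        model="model_gemma_27b_it",
        messages=[{"role": "user", "content": "Describe your weekend plans."}],
        max_tokens=max_tokens,
    )
    print(f"max_tokens={max_tokens}: {time.time() - start:.1f}s")

If the short request is fast and the long one is slow, capping the generated length in your setup should help; if both are slow, the model is likely too heavy for the hardware.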
Sanjukta.bs
(Sanjukta Biswas)
September 16, 2024, 11:08am
7
More than taking time, it is predicting the wrong flows! Is there any good demo we can follow that uses local LLMs instead of OpenAI?