LLM_API_HEALTH_CHECK fails even though the LLM is available

Hi, this is my first question on this forum. I use Rasa 3.14 and want to use a self-hosted LLM running on my machine; I serve it on localhost with LM Studio. But I get an error saying that the LLM_API_HEALTH_CHECK fails, and I couldn't find what that health check request actually looks like, so that I could test it manually. My LLM is reachable and supports the OpenAI-compatible API.
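
For example, a plain chat completions request like this goes through from my machine (just a minimal sketch using Python's standard library, the default LM Studio port, and the model I loaded):

    import json
    import urllib.request

    # Minimal manual request against LM Studio's OpenAI-compatible endpoint.
    # LM Studio does not check the API key, so any placeholder value works.
    payload = {
        "model": "openai/gpt-oss-20b",
        "messages": [{"role": "user", "content": "Say hello in one word."}],
        "max_tokens": 20,
    }
    req = urllib.request.Request(
        "http://localhost:1234/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer lm-studio",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])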

config.yml
recipe: default.v1
language: de
assistant_id: stern-factory
pipeline:
- name: CompactLLMCommandGenerator
  llm:
    model_group: self_hosted_llm
  flow_retrieval:
    active: false
policies:
- name: FlowPolicy

endpoints.yml
# Allow rephrasing of responses using a Rasa-hosted model
nlg:
  type: rephrase
  llm:
    model_group: self_hosted_llm

model_groups:
  - id: self_hosted_llm  
    models:
      - provider: self-hosted
        model: openai/gpt-oss-20b
        api_base: "http://localhost:1234/v1"
  # - id: rasa_command_generation_model
    # models:
    #   - provider: rasa
    #     model: rasa/command-generator-llama-3.1-8b-instruct
    #     api_base: "https://tutorial-llm.rasa.ai"

I followed the steps from the documentation, but it doesn't work: after setting LLM_API_HEALTH_CHECK to True, I get the following error:

2025-11-11 08:41:55 INFO     rasa.shared.utils.health_check.health_check  - [info     ] Sending a test LLM API request for the component - ContextualResponseRephraser. config={'model': 'openai/gpt-oss-20b', 'provider': 'self-hosted', 'api_base': 'http://localhost:1234/v1', 'api_version': None, 'api_type': 'openai', 'use_chat_completions_endpoint': True} event_key=contextual_response_rephraser.init.send_test_llm_api_request

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.

2025-11-11 08:41:57 ERROR    rasa.cli.train  - [error    ] Test call to the LLM API failed for component - ContextualResponseRephraser. config={'model': 'openai/gpt-oss-20b', 'provider': 'self-hosted', 'api_base': 'http://localhost:1234/v1', 'api_version': None, 'api_type': 'openai', 'use_chat_completions_endpoint': True} error=ProviderClientAPIException('
Original error: litellm.APIError: APIError: Hosted_vllmException - Connection error.)') event_key=contextual_response_rephraser.init.send_test_llm_api_request_failed
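
From the log it looks like the health check just sends a small test completion through LiteLLM. For anyone who wants to poke at it outside of Rasa, this is roughly the equivalent standalone call, reconstructed from the config printed above (the hosted_vllm/ prefix is my guess based on the Hosted_vllmException, so the exact request Rasa sends may differ):

    import litellm

    # Rough stand-in for the health check request, based on the config in the log.
    # LiteLLM routes self-hosted OpenAI-compatible servers via the hosted_vllm/ prefix.
    response = litellm.completion(
        model="hosted_vllm/openai/gpt-oss-20b",
        api_base="http://localhost:1234/v1",
        api_key="not-needed",  # LM Studio ignores the key
        messages=[{"role": "user", "content": "ping"}],
    )
    print(response.choices[0].message.content)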

Has anybody had the same problem?

OK, I solved it. For anyone who runs into the same problem: I followed the tutorial using a devcontainer and forgot to use the Docker hostname for my host network (host.docker.internal) instead of localhost.

Changing the config to:

     models:
       - provider: self-hosted
         model: openai/gpt-oss-20b
         api_base: "http://host.docker.internal:1234/v1"

did the trick.
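
In case it helps anyone else: a quick way to check from inside the devcontainer whether the host is actually reachable (assuming LM Studio exposes the standard OpenAI-compatible /v1/models listing):

    import json
    import urllib.request

    # Inside the devcontainer, localhost is the container itself;
    # host.docker.internal points back to the machine running LM Studio.
    url = "http://host.docker.internal:1234/v1/models"
    with urllib.request.urlopen(url, timeout=10) as resp:
        print(json.dumps(json.load(resp), indent=2))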