Training with EnterpriseSearchPolicy and testing with rasa shell always predicts action_listen

I am trying to build a simple RAG application and have trained a model with EnterpriseSearchPolicy. Training succeeded, but testing with rasa shell always predicts action_listen with confidence 0.00, and the bot keeps asking for input again and again. I have followed all the steps from here: Chat With Your Text Documents. Logs are attached below.

Logs:

> """
> 2024-05-03 14:50:51 INFO     rasa_plus.tracing.config  - No endpoint for tracing type available in endpoints.yml,tracing will not be configured.
> 2024-05-03 14:50:51 INFO     rasa_plus.telemetry  - Initialised global config file with Rasa Pro telemetry tracking set to True.
> 2024-05-03 14:50:51 DEBUG    urllib3.connectionpool  - Starting new HTTPS connection (1): api.segment.io:443
> 2024-05-03 14:50:52 DEBUG    urllib3.connectionpool  - https://api.segment.io:443 "POST /v1/identify HTTP/1.1" 200 21
> 2024-05-03 14:50:53 DEBUG    urllib3.connectionpool  - Starting new HTTPS connection (1): api.segment.io:443
> 2024-05-03 14:50:55 DEBUG    urllib3.connectionpool  - https://api.segment.io:443 "POST /v1/track HTTP/1.1" 200 21
> 2024-05-03 14:50:55 DEBUG    rasa.cli.utils  - Parameter 'credentials' was not set. Using default location 'credentials.yml' instead.
> 2024-05-03 14:50:55 INFO     root  - Connecting to channel 'cmdline' which was specified by the '--connector' argument. Any other channels will be ignored. To connect to all given channels, omit the '--connector' argument.
> 2024-05-03 14:50:55 DEBUG    sanic.root  - Sanic-CORS: Configuring CORS with resources: {'/*': {'origins': [''], 'methods': 'DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT', 'allow_headers': ['.*'], 'expose_headers': 'filename', 'supports_credentials': True, 'max_age': None, 'send_wildcard': False, 'automatic_options': True, 'vary_header': True, 'resources': {'/*': {'origins': ''}}, 'intercept_exceptions': True, 'always_send': True}}
> 2024-05-03 14:50:55 DEBUG    rasa.core.utils  - Available web server routes: 
> /webhooks/rest                                     GET                            rasa_core_no_api.custom_webhook_CmdlineInput.health
> /webhooks/rest/webhook                             POST                           rasa_core_no_api.custom_webhook_CmdlineInput.receive
> /                                                  GET                            rasa_core_no_api.hello
> 2024-05-03 14:50:55 INFO     root  - Starting Rasa server on http://0.0.0.0:5005
> 2024-05-03 14:50:55 DEBUG    rasa.core.utils  - Using the default number of Sanic workers (1).
> 2024-05-03 14:50:56 DEBUG    urllib3.connectionpool  - Starting new HTTPS connection (1): api.segment.io:443
> 2024-05-03 14:50:57 DEBUG    urllib3.connectionpool  - https://api.segment.io:443 "POST /v1/track HTTP/1.1" 200 21
> 2024-05-03 14:50:57 DEBUG    rasa.core.tracker_store  - Connected to InMemoryTrackerStore.
> 2024-05-03 14:50:57 DEBUG    rasa.core.lock_store  - Connected to lock store 'InMemoryLockStore'.
> 2024-05-03 14:50:57 DEBUG    rasa.core.nlg.generator  - Instantiated NLG to 'TemplatedNaturalLanguageGenerator'.
> 2024-05-03 14:50:57 INFO     rasa.core.processor  - Loading model models/20240503-145031-coal-deque.tar.gz...
> 2024-05-03 14:50:58 DEBUG    rasa.engine.storage.local_model_storage  - Extracted model to '/var/folders/vn/685bqpr57vg84klz1d510f1m0000gp/T/tmp6trxg_ui'.
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=NLUMessageConverter constructor=load kwargs={} node_name=nlu_message_converter
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=FlowsProvider constructor=load kwargs={} node_name=flows_provider
> 2024-05-03 14:50:58 DEBUG    rasa.engine.storage.local_model_storage  - Resource 'flows_provider' was requested for reading.
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=NLUCommandAdapter constructor=load kwargs={} node_name=run_NLUCommandAdapter0
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=LLMCommandGenerator constructor=load kwargs={} node_name=run_LLMCommandGenerator1
> 2024-05-03 14:50:58 DEBUG    rasa.engine.storage.local_model_storage  - Resource 'train_LLMCommandGenerator1' was requested for reading.
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=RegexMessageHandler constructor=load kwargs={} node_name=run_RegexMessageHandler
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=DomainProvider constructor=load kwargs={} node_name=domain_provider
> 2024-05-03 14:50:58 DEBUG    rasa.engine.storage.local_model_storage  - Resource 'domain_provider' was requested for reading.
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=CommandProcessorComponent constructor=load kwargs={} node_name=command_processor
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=FlowPolicy constructor=load kwargs={} node_name=run_FlowPolicy0
> 2024-05-03 14:50:58 DEBUG    rasa.engine.storage.local_model_storage  - Resource 'train_FlowPolicy0' was requested for reading.
> 2024-05-03 14:50:58 DEBUG    rasa.core.policies.policy  - Couldn't load metadata for policy 'FlowPolicy' as the persisted metadata couldn't be loaded.
> 2024-05-03 14:50:58 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=EnterpriseSearchPolicy constructor=load kwargs={} node_name=run_rasa_plus.ml.EnterpriseSearchPolicy1
> 2024-05-03 14:50:58 DEBUG    rasa.shared.utils.llm  - [debug    ] llmfactory.create.embedder     config={'model_name': 'lanwuwei/GigaBERT-v4-Arabic-and-English', 'model_kwargs': {'device': 'mps'}, 'encode_kwargs': {'normalize_embeddings': True}, '_type': 'huggingface'}
> 2024-05-03 14:50:58 INFO     sentence_transformers.SentenceTransformer  - Load pretrained SentenceTransformer: lanwuwei/GigaBERT-v4-Arabic-and-English
> 2024-05-03 14:50:58 DEBUG    urllib3.connectionpool  - Starting new HTTPS connection (1): huggingface.co:443
> 2024-05-03 14:51:00 DEBUG    urllib3.connectionpool  - https://huggingface.co:443 "HEAD /lanwuwei/GigaBERT-v4-Arabic-and-English/resolve/main/modules.json HTTP/1.1" 404 0
> 2024-05-03 14:51:00 WARNING  sentence_transformers.SentenceTransformer  - No sentence-transformers model found with name lanwuwei/GigaBERT-v4-Arabic-and-English. Creating a new one with MEAN pooling.
> 2024-05-03 14:51:00 DEBUG    urllib3.connectionpool  - https://huggingface.co:443 "HEAD /lanwuwei/GigaBERT-v4-Arabic-and-English/resolve/main/config.json HTTP/1.1" 200 0
> 2024-05-03 14:51:01 DEBUG    urllib3.connectionpool  - Starting new HTTPS connection (1): huggingface.co:443
> 2024-05-03 14:51:01 DEBUG    urllib3.connectionpool  - https://huggingface.co:443 "HEAD /lanwuwei/GigaBERT-v4-Arabic-and-English/resolve/main/model.safetensors HTTP/1.1" 404 0
> Some weights of the model checkpoint at lanwuwei/GigaBERT-v4-Arabic-and-English were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.decoder.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
> - This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
> - This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
> 2024-05-03 14:51:02 DEBUG    urllib3.connectionpool  - https://huggingface.co:443 "HEAD /lanwuwei/GigaBERT-v4-Arabic-and-English/resolve/main/tokenizer_config.json HTTP/1.1" 200 0
> 2024-05-03 14:51:02 INFO     rasa_plus.ml.enterprise_search_policy  - [info     ] enterprise_search_policy.load  config={'priority': 6, 'vector_store': {'type': 'faiss', 'source': './docs'}, 'llm': {'model': 'samalingo-arabic-chat:latest', 'max_tokens': 20, 'type': 'openai', 'openai_api_base': 'http://0.0.0.0:11434', 'openai_api_key': 'foo'}, 'embeddings': {'model_name': 'lanwuwei/GigaBERT-v4-Arabic-and-English', 'model_kwargs': {'device': 'mps'}, 'encode_kwargs': {'normalize_embeddings': True}, '_type': 'huggingface'}}
> 2024-05-03 14:51:02 DEBUG    rasa.engine.storage.local_model_storage  - Resource 'train_rasa_plus.ml.EnterpriseSearchPolicy1' was requested for reading.
> 2024-05-03 14:51:02 INFO     rasa_plus.information_retrieval.faiss  - [info     ] information_retrieval.faiss_store.load_index path=PosixPath('/var/folders/vn/685bqpr57vg84klz1d510f1m0000gp/T/tmpj8yvyrzv/train_rasa_plus.ml.EnterpriseSearchPolicy1/documents_faiss')
> 2024-05-03 14:51:02 DEBUG    faiss.loader  - Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
> 2024-05-03 14:51:02 INFO     faiss.loader  - Loading faiss.
> 2024-05-03 14:51:02 INFO     faiss.loader  - Successfully loaded faiss.
> 2024-05-03 14:51:03 DEBUG    rasa.engine.storage.local_model_storage  - Resource 'train_rasa_plus.ml.EnterpriseSearchPolicy1' was requested for reading.
> 2024-05-03 14:51:03 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=RuleOnlyDataProvider constructor=load kwargs={} node_name=rule_only_data_provider
> 2024-05-03 14:51:03 DEBUG    rasa.engine.storage.local_model_storage  - Resource 'rule_only_data_provider' was requested for reading.
> 2024-05-03 14:51:03 DEBUG    rasa.graph_components.providers.rule_only_provider  - Failed to load rule-only data from a trained 'RulePolicy'. Providing empty rule-only data instead.
> 2024-05-03 14:51:03 DEBUG    rasa.engine.graph  - [debug    ] graph.node.loading_component   clazz=DefaultPolicyPredictionEnsemble constructor=load kwargs={} node_name=select_prediction
> 2024-05-03 14:51:03 INFO     root  - Rasa server is up and running.
> 2024-05-03 14:51:03 INFO     root  - Enabling coroutine debugging. Loop id 12089895008.
> Bot loaded. Type a message and press enter (use '/stop' to exit): 
> Your input ->  what is zatca?                                                                                                                
> 2024-05-03 14:51:15 DEBUG    rasa.core.lock_store  - Issuing ticket for conversation '33fe55ad990043f795960031c4d02391'.
> 2024-05-03 14:51:15 DEBUG    rasa.core.lock_store  - Acquiring lock for conversation '33fe55ad990043f795960031c4d02391'.
> 2024-05-03 14:51:15 DEBUG    rasa.core.lock_store  - Acquired lock for conversation '33fe55ad990043f795960031c4d02391'.
> 2024-05-03 14:51:15 DEBUG    rasa.core.tracker_store  - Could not find tracker for conversation ID '33fe55ad990043f795960031c4d02391'.
> 2024-05-03 14:51:15 DEBUG    rasa.core.tracker_store  - No event broker configured. Skipping streaming events.
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - Starting a new session for conversation ID '33fe55ad990043f795960031c4d02391'.
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - [debug    ] processor.actions.policy_prediction action_name=action_session_start policy_name=None prediction_events=[]
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - [debug    ] processor.actions.log          action_name=action_session_start rasa_events=[SessionStarted(type_name: session_started), ActionExecuted(action: action_listen, policy: None, confidence: None)]
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - [debug    ] processor.slots.log            slots={}
> 2024-05-03 14:51:15 DEBUG    rasa.engine.runner.dask  - Running graph with inputs: {'__message__': [UserMessage(text: what is zatca?, sender_id: 33fe55ad990043f795960031c4d02391)], '__tracker__': DialogueStateTracker(sender_id: 33fe55ad990043f795960031c4d02391)}, targets: ['run_RegexMessageHandler'] and ExecutionContext(model_id='21b2837101f2437b91fe0b9ec4119042', should_add_diagnostic_data=False, is_finetuning=False, node_name=None).
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=NLUMessageConverter fn=convert_user_message node_name=nlu_message_converter
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=FlowsProvider fn=provide_inference node_name=flows_provider
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=NLUCommandAdapter fn=process node_name=run_NLUCommandAdapter0
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=LLMCommandGenerator fn=process node_name=run_LLMCommandGenerator1
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=DomainProvider fn=provide_inference node_name=domain_provider
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=RegexMessageHandler fn=process node_name=run_RegexMessageHandler
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - [debug    ] processor.message.parse        parse_data_entities=[] parse_data_intent={'name': None, 'confidence': 0.0} parse_data_text=what is zatca?
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - Logged UserUtterance - tracker now has 4 events.
> 2024-05-03 14:51:15 DEBUG    rasa.core.actions.action  - Validating extracted slots: 
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - [debug    ] processor.extract.slots        action_extract_slot=action_extract_slots len_extraction_events=0 rasa_events=[]
> 2024-05-03 14:51:15 DEBUG    rasa.engine.runner.dask  - Running graph with inputs: {'__tracker__': DialogueStateTracker(sender_id: 33fe55ad990043f795960031c4d02391)}, targets: ['command_processor'] and ExecutionContext(model_id='21b2837101f2437b91fe0b9ec4119042', should_add_diagnostic_data=False, is_finetuning=False, node_name=None).
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=FlowsProvider fn=provide_inference node_name=flows_provider
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=CommandProcessorComponent fn=execute_commands node_name=command_processor
> 2024-05-03 14:51:15 DEBUG    rasa.dialogue_understanding.processor.command_processor  - [debug    ] command_processor.clean_up_commands.final_commands command=[]
> 2024-05-03 14:51:15 DEBUG    rasa.engine.runner.dask  - Running graph with inputs: {'__tracker__': DialogueStateTracker(sender_id: 33fe55ad990043f795960031c4d02391), '__endpoints__': <rasa.core.utils.AvailableEndpoints object at 0x2d0964f10>}, targets: ['select_prediction'] and ExecutionContext(model_id='21b2837101f2437b91fe0b9ec4119042', should_add_diagnostic_data=False, is_finetuning=False, node_name=None).
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=RuleOnlyDataProvider fn=provide node_name=rule_only_data_provider
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=DomainProvider fn=provide_inference node_name=domain_provider
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=EnterpriseSearchPolicy fn=predict_action_probabilities node_name=run_rasa_plus.ml.EnterpriseSearchPolicy1
> 2024-05-03 14:51:15 DEBUG    rasa.shared.utils.llm  - [debug    ] llmfactory.create.llm          config={'_type': 'openai', 'request_timeout': 10, 'temperature': 0.0, 'max_tokens': 20, 'model_name': 'gpt-3.5-turbo', 'max_retries': 1, 'model': 'samalingo-arabic-chat:latest', 'openai_api_base': 'http://0.0.0.0:11434', 'openai_api_key': 'foo'}
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=FlowsProvider fn=provide_inference node_name=flows_provider
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=FlowPolicy fn=predict_action_probabilities node_name=run_FlowPolicy0
> 2024-05-03 14:51:15 DEBUG    rasa.engine.graph  - [debug    ] graph.node.running_component   clazz=DefaultPolicyPredictionEnsemble fn=combine_predictions_from_kwargs node_name=select_prediction
> 2024-05-03 14:51:15 DEBUG    rasa.core.policies.ensemble  - Made prediction using user intent.
> 2024-05-03 14:51:15 DEBUG    rasa.core.policies.ensemble  - Added `DefinePrevUserUtteredFeaturization(False)` event.
> 2024-05-03 14:51:15 DEBUG    rasa.core.policies.ensemble  - Predicted next action using FlowPolicy.
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - Predicted next action 'action_listen' with confidence 0.00.
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - [debug    ] processor.actions.policy_prediction action_name=action_listen policy_name=FlowPolicy prediction_events=[<rasa.shared.core.events.DefinePrevUserUtteredFeaturization object at 0x2ffd64ca0>]
> 2024-05-03 14:51:15 DEBUG    rasa.core.processor  - [debug    ] processor.actions.log          action_name=action_listen rasa_events=[]
> 2024-05-03 14:51:15 DEBUG    rasa.core.tracker_store  - No event broker configured. Skipping streaming events.
> 2024-05-03 14:51:15 DEBUG    rasa.core.lock_store  - Deleted lock for conversation '33fe55ad990043f795960031c4d02391'.
> Your input ->                                                                                                                                
> """

Any leads on how to solve this?

Hi @vishnupriyavr. Could you share your config.yml?

Hi @souvikg10, thanks for responding. Here is the config:

recipe: default.v1
language: en
assistant_id: 20230405-114328-tranquil-mustard

pipeline:
  - name: LLMCommandGenerator
    llm:
      model_name: gpt-3.5-turbo:latest
      max_tokens: 20
      type: openai
      openai_api_base: "http://127.0.0.1:11434"
      openai_api_key: foo
    user_input:
      max_characters: 420
    flow_retrieval:
      active: false
    embeddings:
      type: "huggingface"
      model_name: "MhmdSyd/Embedding-Arabic-English"
      model_kwargs:
        device: "mps"
      encode_kwargs:
        normalize_embeddings: True

policies:
  - name: EnterpriseSearchPolicy
    llm:
      model_name: gpt-3.5-turbo:latest
      max_tokens: 20
      type: openai
      openai_api_base: "http://127.0.0.1:11434"
      openai_api_key: foo
    embeddings:
      type: "huggingface"
      model_name: "MhmdSyd/Embedding-Arabic-English"
      model_kwargs:
        device: "mps"
      encode_kwargs:
        normalize_embeddings: True
    vector_store: 
      type: "faiss" 
      source: "./docs"

Just to give some context: I am serving an Ollama model locally, and to make use of the OpenAI compatibility that Chris suggested in my previous post, I copied the Ollama model under the name gpt-3.5-turbo and am using that as the LLM.
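Roughly, the steps I followed were the following (sketched from memory; the model name is the one from my config, and the commands assume a standard local Ollama install):

```shell
# Pull the model I want to serve locally.
ollama pull samalingo-arabic-chat:latest

# Copy it under the name the OpenAI-compatible client asks for,
# so requests for "gpt-3.5-turbo" resolve to the local model.
ollama cp samalingo-arabic-chat:latest gpt-3.5-turbo:latest

# Start the Ollama server (listens on http://127.0.0.1:11434 by default).
ollama serve
```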

You are missing FlowPolicy in your config.

How are you triggering RAG? Presumably via pattern_search, correct?
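If you are relying on pattern_search, a minimal override in your flows file would look roughly like this (a sketch based on the Rasa Pro docs; the `action_trigger_search` step is what hands the turn to EnterpriseSearchPolicy):

```yaml
flows:
  pattern_search:
    description: Flow for handling knowledge-based questions
    name: pattern search
    steps:
      - action: action_trigger_search
```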

For this you will also need to include FlowPolicy.
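A minimal sketch of the policies section with FlowPolicy added, assuming the rest of your config stays as posted:

```yaml
policies:
  - name: FlowPolicy
  - name: EnterpriseSearchPolicy
    llm:
      model_name: gpt-3.5-turbo:latest
      max_tokens: 20
      type: openai
      openai_api_base: "http://127.0.0.1:11434"
      openai_api_key: foo
    embeddings:
      type: "huggingface"
      model_name: "MhmdSyd/Embedding-Arabic-English"
      model_kwargs:
        device: "mps"
      encode_kwargs:
        normalize_embeddings: True
    vector_store:
      type: "faiss"
      source: "./docs"
```

Without FlowPolicy, the pattern_search flow that routes the user message to Enterprise Search cannot advance, which matches the `action_listen` prediction with confidence 0.00 in your logs.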

Hi @vishnupriyavr

Can you please explain in more detail how you set up the Ollama model serving?