I just tested Rasa 2 with SpacyNLP (spaCy 3) and cannot get all of the entities in the response.
The code is in utl/rasa/spacy at main · utlai/utl · GitHub.
config.yaml
language: zh
pipeline:
  - name: SpacyNLP  # pretrained word vectors
    model: zh_core_web_trf
  - name: SpacyTokenizer  # text tokenizer
  - name: SpacyEntityExtractor  # entity extraction with spaCy's pretrained NER
  - name: SpacyFeaturizer  # feature extractor: turns a sentence into a vector
    pooling: mean
  - name: CountVectorsFeaturizer  # bag-of-words representation of user messages and labels (intents and responses); creates features for intent classification and response selection
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier  # intent classification
    epochs: 100
  - name: EntitySynonymMapper  # entity synonyms
  - name: RegexFeaturizer
  - name: ResponseSelector
    epochs: 100
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
domain.yaml
version: "2.0"
intents:
  - opt_log
  - opt_test
entities:
  - print
  - loglevel
  - message
slots:
  print:
    type: text
    influence_conversation: false
  loglevel:
    type: text
    influence_conversation: false
  message:
    type: text
    influence_conversation: false
responses:
  utter_greate:
    - text: "fine"
session_config:
  session_expiration_time: 60
  carry_over_slots_to_new_session: true
nlu.yml
version: "2.0"
nlu:
  - intent: opt_log
    examples: |
      - [打印]{"entity":"print", "value":"syn_print"}[消息]{"entity":"loglevel", "value":"lkp_loglevel"}[内容](message)
      - [打印]{"entity":"print", "value":"syn_print"}[消息]{"entity":"loglevel", "value":"lkp_loglevel"}日志[内容](message)
I send the request below to the NLU server.
{
  "text": "打印错误测试"
}
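For anyone who wants to reproduce the request, here is a minimal sketch in Python. It assumes the NLU server is running locally on Rasa's default port 5005 and is reached through the HTTP API's `/model/parse` endpoint; the post does not state the actual URL, so adjust `NLU_URL` as needed. The helper names `build_parse_request` and `parse_message` are made up for this illustration.

```python
import json
import urllib.request

# Assumption: local server on Rasa's default port; not stated in the post.
NLU_URL = "http://localhost:5005/model/parse"


def build_parse_request(text, url=NLU_URL):
    """Build a POST request whose JSON body is {"text": text}."""
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_message(text, url=NLU_URL):
    """Send the text to the NLU server and return the decoded JSON reply."""
    with urllib.request.urlopen(build_parse_request(text, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (needs a running server):
# result = parse_message("打印错误测试")
# print(json.dumps(result, ensure_ascii=False, indent=2))
```

Encoding the body as UTF-8 explicitly keeps the Chinese text intact regardless of the platform's default encoding.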
Why does the returned message contain neither the lookup entity "loglevel" nor the entity "message"? It only contains the entity "print" with its synonym value, which is correct.
{
  "text": "打印错误测试",
  "intent": {
    "id": 6128636546715775035,
    "name": "opt_log",
    "confidence": 0.9967185854911804
  },
  "entities": [
    {
      "entity": "print",
      "start": 0,
      "end": 2,
      "confidence_entity": 0.9995416402816772,
      "value": "syn_print",
      "extractor": "DIETClassifier",
      "processors": [
        "EntitySynonymMapper"
      ]
    }
  ],
  "intent_ranking": [
    {
      "id": 6128636546715775035,
      "name": "opt_log",
      "confidence": 0.9967185854911804
    },
    {
      "id": -1888143233846630085,
      "name": "opt_test",
      "confidence": 0.0032813996076583862
    }
  ],
  "response_selector": {
    "all_retrieval_intents": [],
    "default": {
      "response": {
        "id": null,
        "responses": null,
        "response_templates": null,
        "confidence": 0.0,
        "intent_response_key": null,
        "utter_action": "utter_None",
        "template_name": "utter_None"
      },
      "ranking": []
    }
  }
}
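To make the gap explicit, the response can be checked against the entities declared in domain.yaml. This is only a small sketch: `missing_entities` is a hypothetical helper, and the `response` dict is trimmed to the fields it needs.

```python
def missing_entities(response, expected):
    """Return the expected entity names that are absent from a parse response."""
    found = {entity["entity"] for entity in response.get("entities", [])}
    return [name for name in expected if name not in found]


# Trimmed copy of the response above; only "print" was extracted.
response = {
    "text": "打印错误测试",
    "entities": [
        {
            "entity": "print",
            "start": 0,
            "end": 2,
            "value": "syn_print",
            "extractor": "DIETClassifier",
        }
    ],
}

# Entity names as declared in domain.yaml.
print(missing_entities(response, ["print", "loglevel", "message"]))
# prints ['loglevel', 'message']
```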
By the way, I have two more questions:
1. How can I remove the warning below?
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
2. How can I remove the warnings below?
/usr/local/lib/python3.8/site-packages/rasa/utils/train_utils.py:558: UserWarning: constrain_similarities is set to `False`. It is recommended to set it to `True` when using cross-entropy loss. It will be set to `True` by default, Rasa Open Source 3.0.0 onwards.
rasa.shared.utils.io.raise_warning(
/usr/local/lib/python3.8/site-packages/rasa/utils/train_utils.py:531: UserWarning: model_confidence is set to `softmax`. It is recommended to try using `model_confidence=linear_norm` to make it easier to tune fallback thresholds.
rasa.shared.utils.io.raise_warning(
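For the first warning, the message itself suggests setting the environment variable TOKENIZERS_PARALLELISM. A minimal sketch of doing that from Python follows; it assumes you control the script that starts training or the server, so that the variable is set before `tokenizers` is first used. `disable_tokenizers_parallelism` is a made-up helper name. Exporting the variable in the shell before running Rasa should have the same effect.

```python
import os


def disable_tokenizers_parallelism(env=os.environ):
    """Set TOKENIZERS_PARALLELISM to "false", as the warning text suggests."""
    env["TOKENIZERS_PARALLELISM"] = "false"
    return env


# Call this at the very top of the entry script, before spaCy or
# transformers (and therefore tokenizers) is imported or used.
disable_tokenizers_parallelism()
```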