As far as I can see, it is not possible to send YAML training data as JSON to the Rasa REST API: the server treats the training data in the request as Markdown. So you cannot migrate all of your training data to YAML if you use the REST API and send the data as JSON.
It is possible to send the training data as a YAML string, but there is no documentation on what that string must look like. The code tells me:
dict(
    domain=str(training_data),
    config=str(training_data),
    training_files=str(temp_dir),
    output=model_output_directory,
    force_training=request.args.get("force_training", False),
)
But what exactly is this training_files?
mloubser (Melinda Loubser), February 12, 2021, 9:59am
It is possible, but the documentation is not clear. See these issues:
opened 06:57AM - 27 Oct 20 UTC
closed 09:15AM - 09 Jan 23 UTC
type:docs :book:
area:rasa-oss
area:rasa-oss/infrastructure
area:rasa-oss/server
**Rasa version**:
Version: class InlineResponse200
{ version: 1.10.3 minimum… CompatibleVersion: 1.10.0 }
**Rasa SDK version** (if used & relevant):
Generated JAVA Source with openapi spec here [https://rasa.com/docs/rasa/spec/rasa.yml](https://rasa.com/docs/rasa/spec/rasa.yml)
**Rasa X version** (if used & relevant): n/a
**Python version**: n/a - Rasa Helm charts for Kubernetes
**Operating system** (windows, osx, ...): n/a - Rasa Helm charts for Kubernetes
**Issue**:
The connection to the API is OK, but when I try to train a model I get an error with no clear indication of what is wrong. The setup:
JSONTrainingRequest trainingRequest = new JSONTrainingRequest();
trainingRequest.setNlu(cleanrasanlu);
trainingRequest.setConfig(cleanrasaconfig);
trainingRequest.setStories(cleanrasastories);
trainingRequest.setDomain(cleanrasadomain);
trainingRequest.saveToDefaultModelDirectory(true);
trainingRequest.force(false);
ObjectMapper m = new ObjectMapper();
log.info("start training: {}", m.writeValueAsString(trainingRequest));
ApiResponse<File> f = trainapi.trainModelWithHttpInfo(trainingRequest,true,false);
List<String> filenames = f.getHeaders().get("filename");
log.info("trainthread {} training result={}", getName(), filenames.get(0));
The values within the trainingRequest structure are:
{"domain":"version: \"2.0\"\n\nintents:\n - affirm\n - deny\n - greet\n - thankyou\n - goodbye\n - search_concerts\n - search_venues\n - compare_reviews\n - bot_challenge\n - nlu_fallback\n - how_to_get_started\n\n\nsession_config:\n session_expiration_time: 60 # value in minutes\n carry_over_slots_to_new_session: true","config":"language: en\n\npipeline:\n - name: \"WhitespaceTokenizer\"\n - name: \"RegexFeaturizer\"\n - name: \"LexicalSyntacticFeaturizer\"\n - name: \"CountVectorsFeaturizer\"\n - name: \"CountVectorsFeaturizer\"\n analyzer: \"char_wb\"\n min_ngram: 1\n max_ngram: 4\n - name: \"DIETClassifier\"\n epochs: 100\n - name: FallbackClassifier\n threshold: 0.4\n ambiguity_threshold: 0.1\n - name: \"EntitySynonymMapper\"\n\npolicies:\n - name: TEDPolicy\n max_history: 5\n epochs: 200\n batch_size: 50\n max_training_samples: 300\n - name: MemoizationPolicy\n - name: RulePolicy","nlu":"version: \"2.0\"\nnlu:\n - intent: greet\n examples: |\n - hi\n - hello\n - how are you\n - good morning\n - good evening\n - hey\n\n - intent: goodbye\n examples: |\n - bye\n - goodbye\n - ciao\n\n - intent: thankyou\n examples: |\n - thanks\n - thank you\n - thanks friend\n\n - intent: search_concerts\n examples: |\n - Find me some good concerts\n - Show me concerts\n - search concerts\n\n - intent: search_venues\n examples: |\n - Find me some good venues\n - Show me venues\n - search venues\n\n - intent: compare_reviews\n examples: |\n - compare reviews\n - show me a comparison of the reviews\n\n - intent: how_to_get_started\n examples: |\n - how do I get started\n - what can I do\n - start\n\n - intent: affirm\n examples: |\n - yes\n - yeah\n - yep\n\n - intent: deny\n examples: |\n - nope\n - no\n - absolutely not\n ","stories":"#","force":false,"save_to_default_model_directory":true}
**Error (including full traceback)**:
<-- HTTP/1.1 500 Internal Server Error (93ms)
Connection: keep-alive
Keep-Alive: 5
Content-Length: 221
Content-Type: application/json
OkHttp-Sent-Millis: 1603780528856
OkHttp-Received-Millis: 1603780528948
{"version":"1.10.3","status":"failure","message":"An unexpected error occurred during training. Error: expected str, bytes or os.PathLike object, not NoneType","reason":"TrainingError","details":{},"help":null,"code":500}
<-- END HTTP (221-byte body)
[2020-10-27 07:35:28,949]-[pool-22-thread-1] ERROR io.be1.circular.messageprocessing.MessageProcessorRasaNLU - trainthread RASANLU exception=io.be1.circular.messageprocessing.rasaclient.ApiException: Internal Server Error stack=io.be1.circular.messageprocessing.rasaclient.ApiException: Internal Server Error
at io.be1.circular.my-app//io.be1.circular.messageprocessing.rasaclient.ApiClient.handleResponse(ApiClient.java:925)
at io.be1.circular.my-app//io.be1.circular.messageprocessing.rasaclient.ApiClient.execute(ApiClient.java:841)
at io.be1.circular.my-app//io.be1.circular.messageprocessing.rasaclient.api.ModelApi.trainModelWithHttpInfo(ModelApi.java:938)
at io.be1.circular.my-app//io.be1.circular.messageprocessing.MessageProcessorRasaNLU.lambda$starttrainthread$0(MessageProcessorRasaNLU.java:320)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
at java.base/java.lang.Thread.run(Thread.java:832)
[2020-10-27 07:35:28,949]-[pool-22-thread-1] INFO io.be1.circular.messageprocessing.MessageProcessorRasaNLU - trainthread RASANLU stoped thread: pool-22-thread-1 duration=0sec
<-- HTTP/1.1 200 OK (42ms)
**Command or request that led to error**:
--> POST http://rasa-x-1603718536-rasa-production.rasa.svc.cluster.local:5005/model/train?save_to_default_model_directory=true&force_training=false&token=rasaToken HTTP/1.1
Content-Type: application/json; charset=utf-8
Content-Length: 2046
Accept: application/json
User-Agent: Swagger-Codegen/1.0.0/java
{"domain":"version: \"2.0\"\n\nintents:\n - affirm\n - deny\n - greet\n - thankyou\n - goodbye\n - search_concerts\n - search_venues\n - compare_reviews\n - bot_challenge\n - nlu_fallback\n - how_to_get_started\n\n\nsession_config:\n session_expiration_time: 60 # value in minutes\n carry_over_slots_to_new_session: true","config":"language: en\n\npipeline:\n - name: \"WhitespaceTokenizer\"\n - name: \"RegexFeaturizer\"\n - name: \"LexicalSyntacticFeaturizer\"\n - name: \"CountVectorsFeaturizer\"\n - name: \"CountVectorsFeaturizer\"\n analyzer: \"char_wb\"\n min_ngram: 1\n max_ngram: 4\n - name: \"DIETClassifier\"\n epochs: 100\n - name: FallbackClassifier\n threshold: 0.4\n ambiguity_threshold: 0.1\n - name: \"EntitySynonymMapper\"\n\npolicies:\n - name: TEDPolicy\n max_history: 5\n epochs: 200\n batch_size: 50\n max_training_samples: 300\n - name: MemoizationPolicy\n - name: RulePolicy","nlu":"version: \"2.0\"\nnlu:\n - intent: greet\n examples: |\n - hi\n - hello\n - how are you\n - good morning\n - good evening\n - hey\n\n - intent: goodbye\n examples: |\n - bye\n - goodbye\n - ciao\n\n - intent: thankyou\n examples: |\n - thanks\n - thank you\n - thanks friend\n\n - intent: search_concerts\n examples: |\n - Find me some good concerts\n - Show me concerts\n - search concerts\n\n - intent: search_venues\n examples: |\n - Find me some good venues\n - Show me venues\n - search venues\n\n - intent: compare_reviews\n examples: |\n - compare reviews\n - show me a comparison of the reviews\n\n - intent: how_to_get_started\n examples: |\n - how do I get started\n - what can I do\n - start\n\n - intent: affirm\n examples: |\n - yes\n - yeah\n - yep\n\n - intent: deny\n examples: |\n - nope\n - no\n - absolutely not\n ","stories":"#","force":false,"save_to_default_model_directory":true}
--> END POST (2046-byte body)
<-- HTTP/1.1 200 OK (19ms)
**Content of configuration file (config.yml)** (if relevant):
<pre>language: en
pipeline:
  - name: "WhitespaceTokenizer"
  - name: "RegexFeaturizer"
  - name: "LexicalSyntacticFeaturizer"
  - name: "CountVectorsFeaturizer"
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: "DIETClassifier"
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.4
    ambiguity_threshold: 0.1
  - name: "EntitySynonymMapper"
policies:
  - name: TEDPolicy
    max_history: 5
    epochs: 200
    batch_size: 50
    max_training_samples: 300
  - name: MemoizationPolicy
  - name: RulePolicy</pre>
**Content of domain file (domain.yml)** (if relevant):
<pre>version: "2.0"
intents:
  - affirm
  - deny
  - greet
  - thankyou
  - goodbye
  - search_concerts
  - search_venues
  - compare_reviews
  - bot_challenge
  - nlu_fallback
  - how_to_get_started
session_config:
  session_expiration_time: 60 # value in minutes
  carry_over_slots_to_new_session: true</pre>
opened 03:53AM - 24 Nov 20 UTC
closed 03:20PM - 27 Jan 21 UTC
type:bug
area:rasa-oss
**Rasa version**:2.0.6
**Rasa SDK version** (if used & relevant):
**Rasa X version** (if used & relevant):
**Python version**:3.8.5
**Operating system** (windows, osx, ...):osx
**Issue**: Using the [http-api](https://rasa.com/docs/rasa/pages/http-api#operation/trainModel) & example provided unable to train the model
**Error (including full traceback)**:
```
2020-11-24 09:22:19 DEBUG rasa.server - Extracting JSON payload with Markdown training data from request body.
2020-11-24 09:22:19 DEBUG rasa.server - request payload is:{'domain': 'intents:\n - greet\n - goodbye\n - affirm\n - deny\n - mood_great\n - mood_unhappy\n\nresponses:\n utter_greet:\n - text: "Hey! How are you?"\n\n utter_cheer_up:\n - text: "Here is something to cheer you up:"\n image: "https://i.imgur.com/nGF1K8f.jpg"\n\n utter_did_that_help:\n - text: "Did that help you?"\n\n utter_happy:\n - text: "Great carry on!"\n\n utter_goodbye:\n - text: "Bye"', 'config': 'language: en\npipeline: supervised_embeddings\npolicies:\n - name: MemoizationPolicy\n - name: TEDPolicy', 'nlu': '- intent: greet\n examples: |\n - hey\n - hello\n - hi\n\n- intent: goodbye\n examples: |\n - bye\n - goodbye\n - have a nice day\n - see you\n\n- intent: affirm\n examples: |\n - yes\n - indeed\n\n- intent: deny\n examples: |\n - no\n - never\n\n- intent: mood_great\n examples: |\n - perfect\n - very good\n - great\n\n- intent: mood_unhappy\n examples: |\n - sad\n - not good\n - unhappy', 'responses': "chitchat/ask_name: - text: my name is Sara, Rasa's documentation bot! chitchat/ask_weather: - text: it's always sunny where I live", 'stories': '- story: happy path\n steps:\n - intent: greet\n - action: utter_greet\n - intent: mood_great\n - action: utter_happy\n\n- story: sad path 1\n steps:\n - intent: greet\n - action: utter_greet\n - intent: mood_unhappy\n - action: utter_cheer_up\n - action: utter_did_that_help\n - intent: affirm\n - action: utter_happy\n\n- story: sad path 2\n steps:\n - intent: greet\n - action: utter_greet\n - intent: mood_unhappy\n - action: utter_cheer_up\n - action: utter_did_that_help\n - intent: deny\n - action: utter_goodbye\n\n- story: say goodbye\n steps:\n - intent: goodbye\n - action: utter_goodbye', 'force': False, 'save_to_default_model_directory': True}
2020-11-24 09:22:19 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/config.yml' is 'unk'.
2020-11-24 09:22:19 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/domain.yml' is 'rasa_yml'.
2020-11-24 09:22:19 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/nlu.md' is 'unk'.
2020-11-24 09:22:19 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/responses.md' is 'unk'.
2020-11-24 09:22:19 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/stories.md' is 'unk'.
2020-11-24 09:22:19 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/nlu.md' is 'unk'.
2020-11-24 09:22:19 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/responses.md' is 'unk'.
2020-11-24 09:22:19 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/stories.md' is 'unk'.
2020-11-24 09:22:20 INFO rasa.shared.utils.validation - The 'version' key is missing in the training data file /var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/domain.yml. Rasa Open Source will read the file as a version '2.0' file. See https://rasa.com/docs/rasa/training-data-format.
2020-11-24 09:22:20 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/domain.yml' is 'rasa_yml'.
2020-11-24 09:22:20 INFO rasa.shared.utils.validation - The 'version' key is missing in the training data file /var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/domain.yml. Rasa Open Source will read the file as a version '2.0' file. See https://rasa.com/docs/rasa/training-data-format.
2020-11-24 09:22:20 DEBUG rasa.shared.importers.importer - Added 0 training data examples from the story training data.
2020-11-24 09:22:20 DEBUG rasa.shared.nlu.training_data.loading - Training data format of '/var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/domain.yml' is 'rasa_yml'.
2020-11-24 09:22:20 INFO rasa.shared.utils.validation - The 'version' key is missing in the training data file /var/folders/cp/1v9r04wd7mz63xtx_klyr9g40000gn/T/tmpguneyh_s/domain.yml. Rasa Open Source will read the file as a version '2.0' file. See https://rasa.com/docs/rasa/training-data-format.
No training data given. Please provide stories and NLU data in order to train a Rasa model using the '--data' argument.
2020-11-24 09:22:20 ERROR rasa.server - Ran training, but it finished without a trained model.
```
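Reading the debug log above, the JSON payload appears to be written into a temporary directory (which would be the `temp_dir` passed as `training_files` in the snippet quoted at the top of the thread), with `config` and `domain` saved as `.yml` and `nlu`, `responses` and `stories` saved as `.md`. The YAML content in the `.md` files is then reported as format 'unk' and no training data is found. The following is only a rough illustration of that file layout inferred from the log, not Rasa's actual code; `SUFFIX_BY_KEY` and `dump_json_payload` are made-up names.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Assumption taken from the file names in the debug log:
# config/domain are saved as YAML, everything else as Markdown.
SUFFIX_BY_KEY: Dict[str, str] = {
    "config": ".yml",
    "domain": ".yml",
    "nlu": ".md",
    "responses": ".md",
    "stories": ".md",
}


def dump_json_payload(payload: dict, target_dir: Path) -> List[str]:
    """Write each known string field of the payload to its own file.

    Returns the sorted list of file names that were written.
    """
    written = []
    for key, suffix in SUFFIX_BY_KEY.items():
        content = payload.get(key)
        if isinstance(content, str):
            path = target_dir / f"{key}{suffix}"
            path.write_text(content, encoding="utf-8")
            written.append(path.name)
    return sorted(written)


# A payload shaped like the JSON body in the log (contents shortened).
payload = {
    "config": "language: en\n",
    "domain": "intents:\n  - greet\n",
    "nlu": "- intent: greet\n  examples: |\n    - hi\n",
    "stories": "- story: happy path\n  steps:\n  - intent: greet\n",
    "force": False,
}

with tempfile.TemporaryDirectory() as temp_dir:
    names = dump_json_payload(payload, Path(temp_dir))

print(names)
```

If this reading is right, YAML sent in the `nlu` or `stories` JSON fields lands in a file with a Markdown extension, which would explain the "No training data given" message.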
Hi @mbukovy , it appears our documentation is in need of an update to match the YAML spec.
You can use the /model/train endpoint with YAML by setting the request header’s Content-Type to application/x-yaml. Inside the request body, you’ll pass in a single YAML formatted string containing the configuration, domain definitions, and training data.
For example:
stories:
- story: My story
  steps:
  - intent: greet
  - action: utter_greet
rules:
- rule: My rule
  steps:
  - intent: greet
  - actio…
Try this:
curl --request POST \
  --url http://localhost:5005/model/train \
  --header 'Content-Type: application/x-yaml' \
  --data '---
intents:
  - greet
language: en
nlu:
  - examples: |
      - hi
      - hello
    intent: greet
  - examples: |
      - goodbye
      - bye
    intent: bye
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: DucklingHTTPExtractor
  - epochs: 1
    name: DIETClassifier
policies:
  - name: RulePolicy
responses:
  utter_greet:
    - text: Hi
rules:
  - rule: "My rule"
    steps:
      - intent: greet
      - action: utter_greet
stories:
  - steps:
      - intent: greet
      - action: utter_greet
    story: "My story"
'
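
If you would rather build the same request from code, here is a minimal Python sketch using only the standard library. It assumes your config, domain and training data live in separate YAML snippets with distinct top-level keys (apart from `version`), joins them into one YAML string and prepares a POST to `/model/train` with `Content-Type: application/x-yaml`. The helper names `combine_yaml_sections` and `build_train_request` are hypothetical, the base URL is the one from the curl call, and the `token` and `force_training` query parameters are taken from the request URL shown in the first issue.

```python
import urllib.parse
import urllib.request
from typing import Optional


def combine_yaml_sections(*sections: str) -> str:
    """Join YAML snippets with distinct top-level keys into one document.

    Text-level merge only: drops document start markers and keeps just the
    first top-level `version:` line so the key is not duplicated.
    """
    kept = []
    seen_version = False
    for section in sections:
        for line in section.strip().splitlines():
            if line.rstrip() == "---":
                continue
            if line.startswith("version:"):
                if seen_version:
                    continue
                seen_version = True
            kept.append(line)
    return "\n".join(kept) + "\n"


def build_train_request(
    yaml_body: str,
    base_url: str = "http://localhost:5005",
    token: Optional[str] = None,
    force_training: bool = False,
) -> urllib.request.Request:
    """Prepare (but do not send) the training request."""
    params = {"force_training": str(force_training).lower()}
    if token:
        params["token"] = token
    url = f"{base_url}/model/train?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(
        url,
        data=yaml_body.encode("utf-8"),
        headers={"Content-Type": "application/x-yaml"},
        method="POST",
    )


config = "language: en\npipeline:\n  - name: WhitespaceTokenizer\n"
domain = 'version: "2.0"\nintents:\n  - greet\n'
nlu = (
    'version: "2.0"\n'
    "nlu:\n"
    "  - intent: greet\n"
    "    examples: |\n"
    "      - hi\n"
    "      - hello\n"
)

body = combine_yaml_sections(config, domain, nlu)
request = build_train_request(body, token="rasaToken")

# To actually start training against a running server:
# with urllib.request.urlopen(request) as response:
#     model_bytes = response.read()
```

The request is only built here, not sent, so you can inspect `request.full_url` and `request.data` before pointing it at a real server.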