Questions about data format and composite entities

Hi guys! I have two questions about rasa:

  1. Does rasa x support the json format for nlu training data? It seems to read only markdown from the data folder, and uploading json through the UI automatically converts it into md format.

  2. Does rasa x support composite entities now? I’ve been using an older version of rasa with the Innatis library developed by CarLabs. However, it is not completely bug-free and does not support the latest rasa. Or does anyone have suggestions for handling composite entities with rasa the way dialogflow does?

Thanks!


I’ve written a component for composite entities that I’ve just updated for rasa 1.x. You could check whether it fits your use case.

Hey Benjamin! Thanks for the response! Your library works really well. I noticed that your extractor requires tagging the root entities, and it then figures out the composite from them. Is it possible to nest more than one layer, like in dialogflow? That is to say, composite entity A contains composite entity B, which contains entity C. Thanks in advance!

Nesting composite entities is currently not supported. Right now, longer patterns take precedence over shorter patterns.

Example: You have two patterns A and B. A is longer than B (as in: the pattern has more characters). Both patterns match in your user query and the match for B is contained in the match for A. Then pattern B will be ignored.

The idea has been that longer patterns probably contain the same information as the shorter patterns “plus something extra” and you would therefore not need the smaller pattern in that case.
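To make the precedence rule concrete, here is a minimal sketch of it. It is not the library’s actual implementation: the names `Match`, `find_matches` and `drop_contained_matches` are hypothetical, and patterns are treated as literal text rather than the extractor’s real pattern syntax.

```python
import re
from typing import List, NamedTuple


class Match(NamedTuple):
    pattern: str  # the pattern text that produced this match
    start: int    # start offset in the user query
    end: int      # end offset (exclusive) in the user query


def find_matches(patterns: List[str], query: str) -> List[Match]:
    """Find every occurrence of every pattern (assumption: literal text)."""
    matches = []
    for pattern in patterns:
        for m in re.finditer(re.escape(pattern), query):
            matches.append(Match(pattern, m.start(), m.end()))
    return matches


def drop_contained_matches(matches: List[Match]) -> List[Match]:
    """Keep matches of longer patterns; drop matches lying inside a kept one."""
    kept: List[Match] = []
    # Visit longest patterns first so they claim their span before shorter ones.
    for match in sorted(matches, key=lambda m: len(m.pattern), reverse=True):
        contained = any(
            k.start <= match.start and match.end <= k.end for k in kept
        )
        if not contained:
            kept.append(match)
    # Return the surviving matches in query order.
    return sorted(kept, key=lambda m: m.start)
```

With the patterns "red shirt" (B) and "red shirt with stripes" (A) and the query "a red shirt with stripes", both patterns match, but only the match for A survives because B’s span lies inside it.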

I can see the value of nested composites, though. If this feature is important to you, feel free to open an issue on the github page and I’ll see what I can do!

Sure. I guess it is not a big issue for now, since I can modify my data to have at most one composite layer. I had it working with rasa 0.15.0, but when I tried to migrate this morning, after you released the version that supports rasa 1.x, I ran into an issue where the composite json file fails to load. Could you take a look? Here is the traceback:

Training NLU model…
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/__main__.py", line 81, in <module>
    main()
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/__main__.py", line 70, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/cli/train.py", line 132, in train_nlu
    fixed_model_name=args.fixed_model_name,
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/train.py", line 372, in train_nlu
    fixed_model_name=fixed_model_name,
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/train.py", line 391, in _train_nlu_with_validated_data
    config, nlu_data_directory, _train_path, fixed_model_name="nlu"
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/nlu/train.py", line 89, in train
    training_data = load_data(data, nlu_config.language)
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/nlu/training_data/loading.py", line 56, in load_data
    data_sets = [_load(f, language) for f in files]
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/nlu/training_data/loading.py", line 56, in <listcomp>
    data_sets = [_load(f, language) for f in files]
  File "/Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa/nlu/training_data/loading.py", line 115, in _load
    raise ValueError("Unknown data format for file {}".format(filename))
ValueError: Unknown data format for file /var/folders/yd/0wytdp2n6bx4j7nh_km3_xr00000gn/T/tmp2vgbkpkg/e3c2371162544153998fe9dac300943a_nlu.json

These are the warnings logged when training from the rasa x UI:

2019-06-05 10:17:23 WARNING py.warnings - /Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa_composite_entities/composite_entity_extractor.py:129: UserWarning: Failed to load composite entitiesfile from "/var/folders/yd/0wytdp2n6bx4j7nh_km3_xr00000gn/T/tmph9fkb9ve/nlu/composite_entities.json"
  'file from "{}"'.format(composite_entities_file)

2019-06-05 10:30:09 WARNING py.warnings - /Users/brian/rasax/rasax/lib/python3.7/site-packages/rasa_composite_entities/composite_entity_extractor.py:70: UserWarning: The CompositeEntityExtractor could not load the train file.
  "The CompositeEntityExtractor could not load "

Thanks!

I think the issue here is that “composite.json” was not successfully written to the model files. Any idea?

Thanks!
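For anyone debugging a similar “Unknown data format” error, here is a small sketch for checking a json file by hand. It relies on the fact that rasa’s json training format wraps everything in a top-level "rasa_nlu_data" object; whether the loader rejected the file above for exactly that reason is an assumption, and `looks_like_rasa_nlu_json` is a hypothetical helper, not part of rasa or of rasa_composite_entities.

```python
import json
from pathlib import Path


def looks_like_rasa_nlu_json(path: str) -> bool:
    """Return True if the file is JSON with a top-level "rasa_nlu_data" object."""
    try:
        content = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing or unreadable file, or content that is not valid JSON.
        return False
    return isinstance(content, dict) and isinstance(
        content.get("rasa_nlu_data"), dict
    )
```

A json file in the data folder that holds only composite patterns, without the "rasa_nlu_data" wrapper, would return False here.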

I think this is because training through rasa X works differently from training via the script or the http server. I wrote the component before rasa X was a thing. I’m going to test training with rasa X myself over the weekend and will apply a fix. I’ve created an issue using your information: “[BUG] Training doesn't work with rasa X”, Issue #4 in BeWe11/rasa_composite_entities on GitHub. Please report any new insights there. Sorry for the inconvenience :stuck_out_tongue:

Hi! I figured out how to fix this. I will log my solution on github for your reference :blush:

Hi everyone:

I’m using rasa_composite_entities (great library) and I’m trying to set up a rasa x environment to start testing my bot. But when I go to the NLU Training page, my examples appear multiple times, and some of them are also run together like this:

add creamadd creamadd cream

Have you experienced this when using composite_entities?

Thanks!

Hey @juliamendoim,

If you are sure that the issue occurs only when you use the composite entity extractor, please create a GitHub issue describing the problem and I will try to fix it.

I just tried again, and I think it is the composite_entities library. I’ll raise an issue. Thanks.

PS: Before I do that, though, I’m posting a screenshot of what is going on in the NLU Training window, in case someone knows what’s happening:

@BeWe11 Does it support the md format? Can I add the patterns in JSON while keeping my training files in markdown format?