Korean NLU

Hi everyone,

We were wondering if anyone has any experience using Rasa NLU in Korean? Specifically, dealing with tokenization as this is a little bit more complicated than just whitespace tokenization.

Would be great if you could share your experiences :smile:

Thanks, Akela



My colleagues and I are currently dealing with that! Please let me know if you need help.



@asnal05, is the Korean NLU working well for you, by any chance?

Hi @asnal05,

This is Nari Kim, and we plan to implement an application for a class project in Korean. Have you implemented a custom pipeline for Korean (a morphological analyzer, etc.)? Any information would be appreciated.

Thank you!

λ„€ 잘 μž‘λ™ν•©λ‹ˆλ‹€ :slight_smile:


Yes, we have implemented our own pipeline.

We used Mecab for tokenization, but it was not robust against out-of-vocabulary (OOV) words, so we built another tokenizer that works better!

We have tried several pipelines for NLU.
First, we trained the basic CRF and StarSpace components provided by Rasa on the same dataset (i.e. nlu.md). However, we found that it may be a better idea to use different datasets for entity extraction (EE) and intent classification (IC); that was our second approach. We then found that training EE and IC jointly improved accuracy further, and that gave us the best result of the three pipelines.
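For reference, the first (baseline) setup described above roughly corresponds to an old-style Rasa NLU pipeline config like the following. This is my guess at the shape of their config, not their actual file; the tokenizer entry in particular would be replaced by their custom component.

```yaml
language: "ko"
pipeline:
  - name: "tokenizer_whitespace"                     # placeholder; swap in a custom Korean tokenizer
  - name: "ner_crf"                                  # CRF entity extraction
  - name: "intent_featurizer_count_vectors"
  - name: "intent_classifier_tensorflow_embedding"   # StarSpace-based intent classifier
```

The second and third approaches would keep the same components but change how the EE and IC training data are split or shared.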


@asnal05 Thank you very much for the info! Did you share your code or any material online? It would be very helpful for us.

Thank you!

We would also be interested in seeing this if possible :slight_smile:

Dear @akelad and @Nari

Sorry for the late reply! Unfortunately, I am not allowed to release the code due to our company's policy.


@asnal05 Oh, it was a company project! Understood. Thank you for the response.