Can Rasa be used with other languages?

Can Rasa NLU be used for the Sinhala language? If so, what needs to be done?

Hey @uthpala-era. Yes, I know some people from the community who have successfully built a bot with Rasa in Sinhala. To do so, you should use the TensorFlow embedding pipeline, which lets you build bots with Rasa in any language that can be tokenized (Sinhala in your case). You can read more about this pipeline here.

To implement your bot, follow the regular implementation process: create training data for the NLU and Core models (NLU examples should, of course, be in your chosen language), define the tensorflow embedding pipeline, and use it to train the NLU model. Training the Rasa Core model works the same way as for any other language.
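As a rough sketch of the "define the tensorflow embedding pipeline" step, an NLU configuration could look like the one below. It assumes a Rasa NLU version that still ships the tensorflow_embedding pipeline template, and it assumes "si" (the ISO 639-1 code for Sinhala) as the language value; neither detail comes from this thread.

```yaml
# NLU configuration (sketch, not a definitive setup).
# "si" is an assumed language code for Sinhala.
language: "si"

# Pipeline template recommended in the reply above. It learns embeddings
# from your own training data, so it needs no pre-trained word vectors.
pipeline: "tensorflow_embedding"
```

The same two settings can be written in a config.json if you prefer JSON over YAML.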

Thank you, @Juste. I tried an example of creating a chatbot for English, and it worked nicely. The problem came when I switched to Sinhala. In my config.json I've used 'spacy_sklearn' as the pipeline. I also have a data.json with data in Sinhala, and in the templates section of domain.yml I have included Sinhala text responses. However, due to these Unicode characters, I get the error shown below when I try to train: <CoreError: error code 3: Unable to load any data from source yaml file: Path '/'. So I can't continue. Could you please guide me through this?

Hello @uthpala-era. I wouldn't recommend using spaCy because there are no word vectors for the Sinhala language yet. Instead, I would suggest using the tensorflow_embedding pipeline, which allows you to build assistants regardless of the language. Regarding the error you are getting: do you have intents, entities or action names which contain Sinhala characters?

Thanks again for responding, @Juste. I see… I am building this chatbot for research, which is why I tried to use the spacy_sklearn pipeline. Wouldn't it be possible for me to build the word vectors myself?

And I do not have intents, entities or action names with Sinhala characters.

My sample domain.yml file looks like this:

intents:

  • greet
  • goodbye

actions:

  • utter_greet
  • utter_goodbye

templates: utter_greet:

  • text: “හායි”

utter_goodbye:

  • text: “නැවත හමුවෙමු”

Is this exactly how your domain file is formatted? If yes, then the issue is the formatting: utter_goodbye is outside the templates section. Can you double check that? Here is how a domain should be formatted.
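For illustration, a version of the sample domain with both responses nested under templates might look like the sketch below. It assumes the older domain format that uses the templates key, as in the sample above, and the Sinhala response strings are only illustrative.

```yaml
# domain.yml (sketch): every utter_* response sits inside "templates",
# indented one level deeper than the "templates" key itself.
intents:
  - greet
  - goodbye

actions:
  - utter_greet
  - utter_goodbye

templates:
  utter_greet:
    - text: "හායි"
  utter_goodbye:
    - text: "නැවත හමුවෙමු"
```

Saving the file as UTF-8 and indenting with spaces rather than tabs avoids the most common YAML loading problems.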

@Juste… Fortunately, after several tries it worked for me. It could have been a formatting issue. Thanks for the guidance. And just today I realised that the video tutorial I had followed was yours, “Creating a chatbot with Rasa NLU and Rasa Core”. I should mention that it was very helpful to me. Great work.

Glad to hear that! :slight_smile: Sometimes it's just a space in the wrong place or an indentation error that makes it break, but I am glad you solved the issue. Also happy to hear that the tutorial was helpful! :slight_smile:

Isn't there any possibility for us to develop word vectors for other languages? (In my case, I want to develop word vectors for the Sinhala language.)

spaCy allows you to add custom languages. Here is a guide which you can check out. Alternatively, you can play around with the word vectors provided by fastText (I can see they have them for Sinhalese).

Thanks a lot. I will try those out.

@Juste, could you please help me with another issue? I am now using the latest version of Rasa (not Rasa X). Thanks to you guys it is pretty simple now: we can train using rasa train and run the bot using rasa run. I have integrated my bot with Slack. When I type simple greeting messages, it replies nicely. But when a message contains an entity, the bot sometimes gets stuck. Is there any mechanism for the bot developer to check what caused the error? Could you please simply tell me where the error log is?

I would be very grateful if you could reply at your earliest convenience. Thanks.

Sir, since you mentioned that you made a chatbot in the Sinhala language, could you tell me what kind of changes you made to the Rasa setup? I want to make a chatbot in Hindi, so I would like to know what changes I need to make for Rasa to understand and respond in Hindi. Please respond.

Sir, could you guide me on how to make a Rasa-based chatbot in Hindi, with both input and output as Hindi phrases? Please respond if you have any ideas.

@uthpala-era and @Juste, can you both please help me get some information about creating a chatbot in the Sinhala language? Thanks in advance.

Hey @navod @Raghav70007, I have created and deployed a chatbot in Korean. You can test it in Korean only.

Link to bot:

https://thesampark.co.in/guest/conversations/production/b353c68f23e747579564db3239a7b49d

intent:greet

  • 야
  • 안녕하세요
  • 여보세요
  • 좋은 아침
  • 좋은 저녁
  • 저기요

intent:goodbye

  • 안녕
  • 안녕
  • 주위에 당신을보고
  • 나중에 봐

intent:affirm

  • 예
  • 참으로
  • 물론이야
  • 그 좋은 소리
  • 맞아

intent:deny

  • 아니
  • 절대
  • 난 그렇게 생각하지 않아
  • 좋아하지 않아
  • 절대 안돼
  • 아니야

intent:mood_great

  • 완벽
  • 아주 좋아
  • 좋아
  • 놀라운
  • 훌륭한
  • 기분이 아주 좋아
  • 나는 잘 지내고있어
  • 난 괜찮아

intent:mood_unhappy

  • 슬퍼
  • 너무 슬퍼
  • 불행
  • 나쁘다
  • 매우 나쁘다
  • 끔찍한
  • 끔찍한
  • 아주 좋은하지
  • 매우 슬프다
  • 너무 슬퍼

intent:bot_challenge

  • 당신은 봇입니까?
  • 당신은 인간입니까?
  • 봇과 대화하고 있습니까?
  • 인간과 대화하고 있습니까?

Use these lines to test it.

Read this to build your own:

Hi @athenasaurav, can you share the pipeline from your config.yml?

Here is my configuration file:

language: kr

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
policies:
- name: KerasPolicy
  epochs: 200
  max_history: 3
- name: MemoizationPolicy
  max_history: 3

Thank you for this. I'm building a chatbot in the Urdu language. How do I check whether Urdu is supported?

@athenasaurav, thank you very much for the reply. But the link you provided is not working.