I want to build an on-premise Arabic AI chatbot with Rasa. I understand Rasa supports on-premise deployments, but what about support for Arabic? Do I need to use any external translators?
You can train with your NLU data in Arabic using the embedding classifier. Keep in mind that you will need a lot of examples.
In my case, the user will type in Arabic script and the bot needs to reply in Arabic, unlike Bengali typed in English script (Tumi kemn acho? — How are you?).
I hope that following your approach (a Rasa NLU chatbot with spaCy and FastText) should work?
Yeah, it will still work with the embedding classifier even if your script is Arabic.

Here's how the classifier actually works:
Each word in your NLU training data is first tokenised. (In Arabic, a whitespace tokeniser should actually work, meaning that, as in English, "I am going to eat" can be tokenised as ["I", "am", "going", "to", "eat"] by splitting on the whitespace in between. I am not sure whether that is possible in Arabic or not.)
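To illustrate, whitespace splitting works the same way on Arabic strings as on English ones, since Arabic also separates words with spaces. A minimal sketch (the Arabic sentence below is my own example, roughly "how are you today?"):

```python
# Whitespace tokenization: split the sentence on spaces.
english = "I am going to eat"
arabic = "كيف حالك اليوم"  # hypothetical example sentence

print(english.split())  # ['I', 'am', 'going', 'to', 'eat']
print(arabic.split())   # three Arabic tokens, split on the spaces
```

So at the tokenisation step, nothing language-specific is strictly required as long as the script uses spaces between words.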
After tokenisation, starting from an initial vector per word, it creates vectors for all the words in your training data, including your intent names, and uses the intent vectors to build a non-linear classifier based on Facebook's StarSpace algorithm. There is a paper on that.
However, for out-of-vocabulary words it won't give a useful answer unless you have a lot of training data. This is one way of making a classifier completely language-independent.
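For reference, a pipeline config using this language-independent embedding classifier might look like the following (component names as in the Rasa NLU docs of this era; a sketch, so check them against your installed version):

```yaml
language: ar
pipeline:
- name: "tokenizer_whitespace"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
```

Since the featurizer builds its vocabulary purely from your own training examples, no pretrained Arabic model is needed here.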
The second approach is to use the vectors generated by FastText. You can convert the FastText vectors to spaCy's format and use them to create a sklearn classifier. However, this won't work for entity extraction, as we don't have any tagger or parser.
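One way to do that conversion is spaCy 2.x's `init-model` command, which builds a loadable spaCy model directory straight from a word-vector file. A rough sketch (the download URL and filename are assumptions; check the FastText site for the current Arabic vector links):

```shell
# Download the pretrained Arabic FastText vectors (large file)
wget https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.ar.300.vec.gz

# Build a spaCy model directory from the vectors (spaCy 2.x CLI)
python -m spacy init-model ar ./ar_model --vectors-loc cc.ar.300.vec.gz
```

The resulting `./ar_model` directory can then be pointed to in your pipeline config as the spaCy model, but as noted above it only supplies word vectors, not a tagger or parser.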
Check this documentation on the different components.
Aren't those words just zero vectors then? And if all of them are, the intent comes out as None?
Could you then use this FastText approach together with a custom NER_CRF?
I haven't tested the language-agnostic CRF, and you are right about OOV words: the intent should come out as None.