Can't extract entities for Arabic language

ahlam1234 · February 12, 2020, 10:26pm

Intent classification works fine, but I can't extract entities.

This is my config file:

language: "ar"

pipeline:
- name: "tokenizer_whitespace"
- name: "ner_crf"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  intent_tokenization_flag: true
  intent_split_symbol: "_"

What should I do?

rishier827 · February 13, 2020, 5:20am

Did you use the langdetect library? Can you explain your problem in detail? I think I can help; I faced this kind of problem too and solved it.

BimsaraGamage · May 11, 2020, 6:21am

@rishier827 I do entity extraction for the Sinhala language, and I'm facing the same kind of problem as @ahlam1234. Rasa classifies intents correctly, but many entities go unrecognized. I would like to know how you solved this kind of problem. Thank you in advance.

rishier827 · May 11, 2020, 6:46am

@BimsaraGamage Using the langdetect library, I solved the problem of Arabic not being detected, but I'm not sure whether langdetect supports Sinhala.

rishier827 · May 11, 2020, 6:49am

@BimsaraGamage It seems langdetect does not support Sinhala, but a new language can be added:

You need to create a new language profile. The easiest way to do it is to use the langdetect.jar tool, which can generate language profiles from Wikipedia abstract database files or plain text.

Wikipedia abstract database files can be retrieved from “Wikipedia Downloads” (http://download.wikimedia.org/). They form ‘(language code)wiki-(version)-abstract.xml’ (e.g. ‘enwiki-20101004-abstract.xml’ ).

The langdetect link shows how to add a language, so please try it. If you find a library that supports Sinhala, please let me know.
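Until a Sinhala profile exists, a rough stopgap might be to guess the language from the Unicode block of the characters. To be clear, this is not langdetect and not part of Rasa; it is a standard-library sketch, and the names `SCRIPT_BLOCKS` and `guess_language_by_script` are made up for illustration. It only tells scripts apart, so "ar" here really means "Arabic script" (Persian and Urdu text would match too).

```python
# Stopgap language guess based on Unicode script blocks.
# NOT langdetect: it only distinguishes the scripts listed below.

# ISO 639-1 code -> (first, last) code point of the script's main Unicode block.
SCRIPT_BLOCKS = {
    "ar": (0x0600, 0x06FF),  # Arabic block
    "si": (0x0D80, 0x0DFF),  # Sinhala block
}


def guess_language_by_script(text):
    """Return the code whose block covers the most characters, or None."""
    counts = {code: 0 for code in SCRIPT_BLOCKS}
    for ch in text:
        cp = ord(ch)
        for code, (first, last) in SCRIPT_BLOCKS.items():
            if first <= cp <= last:
                counts[code] += 1
    best = max(counts, key=counts.get)
    # If no character fell into any known block, report "unknown".
    return best if counts[best] > 0 else None


if __name__ == "__main__":
    print(guess_language_by_script("مرحبا بالعالم"))  # Arabic-script text
    print(guess_language_by_script("ආයුබෝවන්"))        # Sinhala-script text
    print(guess_language_by_script("hello"))          # neither script
```

This says nothing about entity extraction itself; it would only help route a message to the right language-specific model.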

BimsaraGamage · May 11, 2020, 7:17am

Thank you for the quick reply, @rishier827. I will certainly try your suggestion.
