How to use grammatical articles correctly? (German, French, Spanish)

Hello,

I need to write a chatbot in German. So far everything is working well with Rasa. The only problem is that I don’t know how to handle German articles such as der/die/das. (Please don’t dismiss this: I know how to use them correctly myself; I just don’t know how to get my bot to use them.)

It is hard to explain this in English because English has only one definite article.

I want to ask in German “Kannst du mir einen Apfel geben?” (in English: “Could you give me an apple?”)

and the bot should answer “Hier ist der Apfel” (in English: “Here is the apple”).

But if I ask for a banana, the bot needs to say “Hier ist die Banane”.

I can’t hard-code the articles, e.g. by writing the entities as:

-Die Banane

-Der Apfel

-…

That won’t work because the user could ask the question without the article, and then the bot would not recognise the object.

Does anybody have a solution for this problem?

I think this will also come up in many other languages, such as French, Spanish, …

At the moment I have worked around it by letting the bot say “Hier dein Gegenstand Apfel” (in English: “Here your object apple”).

But this solution makes the bot sound really robotic.

Do you have access to a dictionary that could return the correct article for a given word?

Hey @tebot, following @Ghostvv’s suggestion, the MediaWiki API could be helpful in your case. Through the MediaWiki API you have access to Wiktionary content.

“Apfel”-Example:

Kind regards, Tristan

In this case you can create a custom action that populates slots with the article derived from the extracted entity.

We plan to introduce a special customizable action that will be called automatically after each user utterance: substitute auto slot filling with customizable action · Issue #6267 · RasaHQ/rasa · GitHub. I think it’ll be a good place for this code.

Thank you for the fast responses.

I like the idea of using the MediaWiki API, and it looks really promising. Sadly I can’t work on this right now because I have some other tasks with higher priority at the moment. As soon as I try it I will report my results.

Thanks a lot.

I tried it with a few APIs, but sadly they returned everything except the article or the gender of the noun.

So I solved it by requesting the Wiktionary page directly, without an API, for example https://de.wiktionary.org/wiki/Apfel, and searching the whole page for “Genus”, which is the grammatical gender of the noun. I take the first letter after “Genus:” to get the gender, and from the gender I can build the article I need. It is a really inefficient approach, but it works.

I uploaded the function to GitHub.

You can just copy it and use it if you need it.
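
For readers who do not want to follow the link, the “Genus” lookup described above might look roughly like the sketch below. It is an illustration of the idea, not the uploaded function: the name `gender_from_html` and the sample HTML snippet are made up here, and the sketch assumes the page contains text of the form “Genus: Maskulinum”, which may change whenever Wiktionary changes its markup. Downloading the page is left out so the example runs offline.

```python
import re
from typing import Optional

# Hypothetical helper, not the function from the GitHub repository.
# Assumes the page text contains something like "Genus: Maskulinum".
GENUS_PATTERN = re.compile(r"Genus:\s*([A-Za-z])")


def gender_from_html(html: str) -> Optional[str]:
    """Return "M", "F" or "N" for the first "Genus:" match, else None."""
    match = GENUS_PATTERN.search(html)
    if match is None:
        return None
    letter = match.group(1).upper()
    return letter if letter in ("M", "F", "N") else None


# Made-up snippet standing in for a downloaded Wiktionary page.
sample = '<em title="Genus: Maskulinum">m</em>'
print(gender_from_html(sample))
```

Returning `None` for anything unexpected lets the calling action decide on a fallback phrase instead of guessing an article.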

I imported the function into my actions and added this code:

        # German nouns are capitalised, so normalise the slot value first.
        noun = str(tracker.get_slot('noun')).title()
        # Expected to be "M", "F" or "N" (the first letter after "Genus:").
        gender = german_gender_wiki.search(noun)

        if gender in ("M", "N"):
            article = "dein "
            # SlotSet("article", "dein")
        elif gender == "F":
            article = "deine "
            # SlotSet("article", "deine")
        else:
            # Error case: use the plural form, because the lookup does not recognise plurals.
            article = "deine "

        answer = "Hier " + article + noun
        dispatcher.utter_message(answer)

This solution is OK. It could definitely be better, but it works.

I tried it in German and it worked. I tested it with some French words and I think it worked there too, but I can’t really tell because my French is not that good.

So if any of you try it in other languages, please let me know whether it worked.

Another way would be to keep all the words you need, together with their articles, in a separate file or list. I didn’t do that because you would have to update the list/file every time you add new words to your nlu.md, so you would need to do everything twice.
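
To make the list-based alternative concrete, here is a minimal sketch of what it could look like. The `ARTICLES` dictionary and the `build_reply` helper are hypothetical names chosen for illustration, and the three entries are just examples; unknown nouns fall back to the neutral phrasing used as a workaround earlier in this thread.

```python
from typing import Dict

# Hypothetical hand-maintained lexicon: noun -> definite article.
ARTICLES: Dict[str, str] = {
    "Apfel": "der",
    "Banane": "die",
    "Buch": "das",
}


def build_reply(noun: str) -> str:
    """Use the stored article if the noun is known, else a neutral phrase."""
    key = noun.strip().capitalize()
    article = ARTICLES.get(key)
    if article is None:
        # Fallback: the article-free workaround from earlier in the thread.
        return f"Hier dein Gegenstand {key}"
    return f"Hier ist {article} {key}"


print(build_reply("banane"))
print(build_reply("Kiwi"))
```

The lookup itself is trivial; the cost is the maintenance burden described above, since every new noun in the training data also needs an entry here.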

Hey @tebot,

if you only need the articles for German, you can use DEMorphy. I found it after some research. With just a few lines of code you get the right gender (and even more information).

        from demorphy import Analyzer
        analyzer = Analyzer(char_subs_allowed=True)
        s = analyzer.analyze(u"Apfel")
        print(s[0].gender)

It’s easy to use and fast, and you don’t have to scan Wiktionary every time. I tested it with around 20 words.
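
Once a gender is available, turning it into the article from the original question (“Hier ist der Apfel”) is a small mapping. The sketch below is only an illustration: `definite_article` is a made-up helper, and the tag spellings it accepts (“masc”, “fem”, “neut”, plus the “M”/“F”/“N” letters used earlier in this thread) are assumptions about what a gender source might return, so check them against the actual output of your library. It covers the nominative singular only.

```python
from typing import Optional

# Assumed tag spellings; verify against what your gender source returns.
_ARTICLE_BY_GENDER = {
    "m": "der", "masc": "der",
    "f": "die", "fem": "die",
    "n": "das", "neut": "das",
}


def definite_article(gender: Optional[str]) -> Optional[str]:
    """Map a gender tag to the nominative singular definite article."""
    if not gender:
        return None
    return _ARTICLE_BY_GENDER.get(gender.strip().lower())


article = definite_article("masc")
print(f"Hier ist {article} Apfel")
```

An unknown or missing tag yields `None`, so the caller can fall back to an article-free sentence.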

An alternative could be IWNLP, but I haven’t tested it.

Kind regards, Tristan
