FAQ - Mass integration from Excel - How to perform it?

Hello community,

For FAQ purposes, I am looking for a way to bulk-import into Rasa an Excel file containing:

  • 100 questions (1 per line)
  • 100 intents (1 per question)
  • An average of 20 training phrases per question
  • 100 responses (1 per question)

How can I do this in bulk rather than one by one via Rasa X?

Many thanks

@Baptiste it's easy if you only want to do the FAQs.

Step 1: You just need a .csv file with a questions column and a corresponding answers column. It helps if you provide a set of question variations for each answer.

Step 2: You can use FuzzyWuzzy's extractOne to fetch the data from the .csv and score the match against a set threshold; 80 may be just fine.

Step 3: Now you may be wondering how to train the questions by intent. You only need one intent, i.e. question, and you need to write 4 lines of code that take the questions from the .csv and convert them into nlu.yml or question.yml in the format Rasa recommends for training the bot.

Step 4: All of the above is done in a custom action, i.e. action.py.

Note: The process above is what I suggest for FAQ Q/A with one intent, i.e. question. You can do the same for 99 more as you need, e.g. sales, location, product, etc.

Tip: In conversational AI design, we recommend using as few intents as possible; that is easier to maintain, and neither the bot nor the developer gets confused. So if you can cut some intents, it will be great for your design. Good luck!
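
A minimal sketch of the Step 2 lookup, using Python's standard-library difflib as a stand-in for FuzzyWuzzy's extractOne (the two score differently, so the threshold of 80 would need re-tuning). The FAQ rows and the names `similarity` and `best_answer` are hypothetical, chosen for illustration; this is not the code used in this thread.

```python
from difflib import SequenceMatcher

# Hypothetical FAQ rows; in practice these would be read from the .csv file.
FAQ = [
    ("What are your opening hours?", "We are open 9am to 6pm, Monday to Friday."),
    ("Where is your office located?", "Our office is at 1 Example Street."),
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
]


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0 and 100."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_answer(user_text: str, faq=FAQ, threshold: float = 80.0):
    """Return the answer of the closest stored question, or None below the threshold."""
    question, answer = max(faq, key=lambda row: similarity(user_text, row[0]))
    if similarity(user_text, question) >= threshold:
        return answer
    return None
```

In a custom action, the user's message would be passed to `best_answer`, and a `None` result could trigger a fallback reply.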

Thank you @nik202! I have a few questions because I am not sure your process addresses the requirement (let me phrase it another way: I need a “tool” or a “way” that takes Excel data and creates the required Rasa files).

  1. I didn’t know FuzzyWuzzy, so I looked at this YouTube video (https://www.youtube.com/watch?v=4L0Py4GkmPU), and it seems to me that FuzzyWuzzy does not address the requirement. Am I right?
  2. Can you please share the 4-line solution and explain how it addresses the question?
  3. It looks like the second part of the solution does not do any bulk import; instead it creates a separate file that is then searched for a similar intent, and then tries to match the appropriate response for the given FAQ. Am I right?

Thank you again for your feedback
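
For the requirement as restated here (Excel data in, Rasa files out, one intent per question), the conversion could be sketched as follows. This assumes the sheet is first saved as CSV with one row per training phrase; the column names `intent`, `example`, `response`, the sample rows, and the function `build_rasa_files` are all hypothetical, and the output targets the Rasa 2.0 YAML layout.

```python
import csv
import io
import json

# Hypothetical CSV export of the Excel sheet: one row per training phrase.
# The column names (intent, example, response) are assumptions for this sketch.
SAMPLE_CSV = """intent,example,response
faq_opening_hours,What are your opening hours?,We are open 9am to 6pm.
faq_opening_hours,When do you open?,We are open 9am to 6pm.
faq_location,Where is your office?,Our office is at 1 Example Street.
"""


def build_rasa_files(csv_text: str):
    """Return (nlu_yaml, domain_yaml) strings built from the CSV rows."""
    examples = {}   # intent -> list of training phrases
    responses = {}  # intent -> response text
    for row in csv.DictReader(io.StringIO(csv_text)):
        examples.setdefault(row["intent"], []).append(row["example"])
        responses[row["intent"]] = row["response"]

    nlu = ['version: "2.0"', "nlu:"]
    for intent, phrases in examples.items():
        nlu.append(f"- intent: {intent}")
        nlu.append("  examples: |")
        nlu.extend(f"    - {p}" for p in phrases)

    domain = ['version: "2.0"', "intents:"]
    domain.extend(f"- {intent}" for intent in examples)
    domain.append("responses:")
    for intent, text in responses.items():
        domain.append(f"  utter_{intent}:")
        # json.dumps yields a double-quoted string, which is also valid YAML.
        domain.append(f"  - text: {json.dumps(text, ensure_ascii=False)}")

    return "\n".join(nlu) + "\n", "\n".join(domain) + "\n"
```

The two strings can be written to data/nlu.yml and domain.yml. Rasa also needs a rule or story linking each intent to its `utter_` response; that file could be generated in the same loop.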

@Baptiste Hello! It's OK, you can ask any questions. But trust me, I have designed the Excel/CSV-based process you are looking for and it gives me results. My choice of words or description of the process may be confusing, but I have shared the idea with you.

  1. Yes, you just need to fine-tune it based on your FAQs, i.e. Q/A.

  2. Please see this:

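     import pandas as pd
     from rasa_sdk import Action

     class GetAnswer(Action):
         def __init__(self):
             # Load the FAQ sheet saved as CSV (expects a 'question' column).
             self.faq_data = pd.read_csv('./data/faq_d.csv')
             qs = list(self.faq_data['question'])
             # Write every question as a training example of the single intent "question".
             with open("./data/nlu/faq.yml", "wt", encoding="utf-8") as f:
                 f.write('version: "2.0"\n')
                 f.write("nlu:\n- intent: question\n  examples: |\n")
                 for q in qs:
                     f.write(f"    - {q}\n")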

It's 7 lines :stuck_out_tongue: excluding the 2 basic ones.

Just try this code and thank me later :wink:

  3. The code above fetches the question part of the data and writes the faq.yml file you want to train on. The second piece of code, the FuzzyWuzzy part, then returns your response. For me it’s a clean process, but yes, you need to fine-tune it.

The advantage of this process is that you just add, add, add rows in Excel and run, run, run.

I hope this solves your query. Good luck!

@Baptiste I hope you are doing well and that the above suggestions helped you. If you found a new solution, please share it with us and close this thread with the solution for others. Thanks.

Do you have a GitHub link for this experiment?

@vaidehi No, it’s my own code for the experiment.

Can you share the FuzzyWuzzy code?