Using real chat logs for training?

datistiquo · December 17, 2018, 3:50pm

Hey,

What methods are suitable for preprocessing and preparing real dialogues as training data for Rasa Core? Up to now, I have just read through them and picked out stories by eye. But maybe there are some techniques for this, especially when you have thousands of chat logs?

thanks! (@Juste)

Ghostvv · December 21, 2018, 11:15am

Unfortunately, we don’t have any additional techniques for now.

martinevs · November 13, 2019, 10:26am

Have there been updates in Rasa that allow for this feature? We have transcripts from customer service calls that we would like to use as training data for our dialog and NLU. Apart from the fact that phone calls might have a different tone of voice and conversation style, would this be possible at all? Thanks!

Ghostvv · November 14, 2019, 12:42pm

Do you mean end-to-end training? We are working on it, but it is not yet supported by Rasa.

martinevs · November 15, 2019, 9:52am

I’m not sure what you mean by end-to-end training. I guess one option would be to import the transcripts into Rasa X somehow and annotate them. But ideally we would upload the transcripts and get new stories generated from them. Is this something you are working on?

Ghostvv · November 15, 2019, 4:54pm

Ah, I thought you meant that you have transcripts of the whole dialogue.

tyd · November 15, 2019, 5:28pm

Hi @martinevs. If your conversations are between a Rasa assistant and users, you can import them into Rasa X and annotate them there.

However, it sounds like your transcripts are between humans. In that case, you cannot import them into Rasa X because it is built for conversations between a Rasa assistant and users.

There are still many open research questions in the field of Natural Language Processing (NLP) around learning from conversations, both unlabelled and labelled, so we have a long way to go before we can build an assistant from transcripts of conversations between humans.

Transcripts are really helpful for scoping out user goals and the capabilities of an assistant. Plus, you can use customer messages for NLU training data. However, at this point, it is still a very manual process and will require you to review those transcripts in order to define your domain and annotate training data.
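As a rough illustration of the first step in that manual review, here is a minimal sketch that pulls the customer turns out of raw transcripts and counts repeated messages, so the most common ones can be reviewed and annotated first. It is not a Rasa feature and uses only the Python standard library. The transcript format (one turn per line, written as `speaker: text`), the `customer` label, and all function names are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical transcript format: one turn per line, "<speaker>: <text>".
SAMPLE_TRANSCRIPT = """\
agent: Hello, how can I help you?
customer: I want to reset my password.
agent: Sure, what is your email address?
customer: I want to reset my password!
customer: where is my order
"""

def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def customer_messages(transcript, customer_label="customer"):
    """Yield the normalized text of every turn spoken by the customer."""
    for line in transcript.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip().lower() == customer_label:
            cleaned = normalize(text)
            if cleaned:
                yield cleaned

def annotation_candidates(transcripts):
    """Count distinct customer messages across transcripts, most frequent first."""
    counts = Counter()
    for transcript in transcripts:
        counts.update(customer_messages(transcript))
    return counts.most_common()

candidates = annotation_candidates([SAMPLE_TRANSCRIPT])
for message, count in candidates:
    print(count, message)
```

The output is only a ranked list of candidate messages. Assigning intents and entities to them, and defining the domain, still has to be done by hand as described above.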

martinevs · November 21, 2019, 9:18am

They are between humans indeed. Thanks for your reply!

losimons · May 2, 2020, 1:04pm

Hello Martine,

Out of curiosity: Did you find any solution for your problem?

Many thanks,

Lore
