Feedback on ConveRT Model + Rasa NLU

Hey,

you need to run `ngrok tcp 22` as a command and ensure that SSH is running. As an alternative, we could use TeamViewer or something else.

Regards

I am fine with TeamViewer.

Hi,

julian.gerhard@susiandjames.com

Regards

Sorry, but does this stuff really belong in “Announcements”, and should it be broadcast via email?


Is it possible to persist the ConveRT model instead of downloading it on the fly during training?

@DivyashaAgrawal The ConveRT model is downloaded from TFHub the first time it is run on any new machine. It is persisted in TFHub’s own cache so it can be retrieved later. So the next time training or inference is run on the same machine, it won’t be downloaded but simply retrieved from the cache.
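As a sketch of where that cache lives: tensorflow_hub resolves each module handle to a directory under `TFHUB_CACHE_DIR` (I believe it names the directory with the SHA-1 of the handle; the handle URL and cache path below are hypothetical examples, not the real ConveRT URL):

```python
import hashlib
import os

# Pin TFHub's cache to a persistent location so the downloaded model
# survives across runs (hypothetical path; TFHUB_CACHE_DIR is the env var
# tensorflow_hub reads).
os.environ["TFHUB_CACHE_DIR"] = "/var/cache/tfhub_modules"

# tensorflow_hub keys each cache entry by the SHA-1 of the module handle,
# so repeated loads of the same URL resolve to the same directory.
handle = "https://example.com/convert/model.tar.gz"  # hypothetical handle
cache_key = hashlib.sha1(handle.encode("utf-8")).hexdigest()
cache_path = os.path.join(os.environ["TFHUB_CACHE_DIR"], cache_key)
print(cache_path)
```

If that directory is on persistent storage, training never re-downloads the model.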

Hi @arbazkhan971 Are you still facing any issue with the TF installation? If yes, I would suggest opening a new thread on the forum to discuss it, since it’s a much narrower topic and may not pertain to everyone on this thread. Feel free to post the link to the thread here if you wish. Thanks

Hey @BahlingerTh

Thank you for bringing this to our attention. We have adjusted the tracking settings for the announcement section so that it will only notify the community about the first announcement, and no longer notify for every response.

If you would like to make further adjustments to your personal settings, you can do this by clicking on the tracking circle in the top right of each topic page or going to ‘notifications’ in your user profile settings.


Tried ConveRT for intent classification, but not with Rasa.

Just loaded the TFHub model, created features from it, and tried fitting LogisticRegression, SVC, and 2x fully-connected TF layers on top of these features.
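That setup (encode the text, then fit a plain linear classifier on the encodings) can be sketched as follows. Random vectors stand in for the ConveRT sentence encodings here, since the real encoder has to be loaded from TFHub; the 1024-dimensional size is an assumption:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Random 1024-d vectors stand in for ConveRT sentence encodings.
rng = np.random.default_rng(0)
n, dim, n_intents = 200, 1024, 5
X = rng.normal(size=(n, dim))           # stand-in for encoder(texts)
y = rng.integers(0, n_intents, size=n)  # intent labels

# Fit a linear classifier on top of the frozen encodings.
clf = LogisticRegression(max_iter=1000).fit(X, y)
preds = clf.predict(X)
print(preds.shape)  # one predicted intent per utterance
```

Swapping `LogisticRegression` for `sklearn.svm.SVC(kernel="linear")` gives the second configuration reported below.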

ConveRT Featurizer + LogisticRegression: test results

                  precision    recall  f1-score   support

       micro avg       0.70      0.70      0.70      7934
       macro avg       0.55      0.45      0.49      7934
    weighted avg       0.69      0.70      0.69      7934

ConveRT Featurizer + SVM(kernel=linear): test results

                  precision    recall  f1-score   support

       micro avg       0.71      0.71      0.71      7934
       macro avg       0.54      0.47      0.49      7934
    weighted avg       0.70      0.71      0.70      7934

ConveRT Featurizer + 2x fully-connected TF layers with ReLU activation for both layers: test results

Don’t really have a classification_report for this, but mean test accuracy is ~0.66.

All the train stats for all the above models are >0.99 (micro, macro, weighted).

The most surprising observation from this experiment is that ConveRT + some classifier does not noticeably outperform the good ol’ CountVectorizer + LogisticRegression pipeline.

CountVectorizer + LogisticRegression : test results

                  precision    recall  f1-score   support

       micro avg       0.71      0.71      0.71      7934
       macro avg       0.55      0.46      0.48      7934
    weighted avg       0.70      0.71      0.70      7934
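For reference, that baseline is only a few lines in scikit-learn. A minimal runnable sketch, with toy texts and intent labels standing in for the real customer data:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the real utterances and intents.
texts = ["book a flight", "book a hotel", "cancel my flight",
         "cancel my hotel", "play some music", "play a song"]
labels = ["flight", "hotel", "flight", "hotel", "music", "music"]

# Bag-of-words features feeding a linear classifier.
baseline = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
baseline.fit(texts, labels)
print(baseline.predict(["play a song"])[0])
```

Because CountVectorizer matches surface tokens exactly, this baseline is at least not penalised for out-of-vocabulary spellings the way a pretrained encoder can be.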

This may be specific to my data, so here’s some context:

  1. real customer data
  2. mix of English and Hindi language
  3. LOTS of spelling errors
  4. 50+ intents, whose frequencies approximately follow a Zipf distribution

Wondering why this is the case. @matthen @dakshvar22

We typically train with L2 normalisation of the sentence encodings, and very high dropout.
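The L2 normalisation part is easy to try on top of any of the classifiers above (the dropout would live inside the downstream network). A minimal sketch:

```python
import numpy as np

# L2-normalise each sentence encoding so every row has unit Euclidean
# norm before it is fed to the classifier.
def l2_normalize(x, eps=1e-12):
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)  # eps guards against zero vectors

enc = np.array([[3.0, 4.0], [0.0, 2.0]])  # toy 2-d "encodings"
unit = l2_normalize(enc)
print(np.linalg.norm(unit, axis=-1))  # each row now has norm 1.0
```

After this, a linear classifier effectively operates on cosine-similarity geometry rather than raw encoding magnitudes.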

Pretty close to what we had before, and definitely helping our edge cases (100+ intents).

"micro avg": {
    "precision": 0.9821576351418452,
    "recall": 0.9821576351418452,
    "f1-score": 0.9821576351418452,
    "support": 30209
},
"weighted avg": {
    "precision": 0.9818330519201411,
    "recall": 0.9821576351418452,
    "f1-score": 0.9817390316733208,
    "support": 30209
}

The data model is English words only; that may be a source of the issue.

I was also going through the ConveRT paper last night (I can never spell it right), and I noticed it uses subword tokens but not character-level embeddings, so I bet spelling has a huge impact on performance. I was going to experiment with that myself this morning.
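To illustrate why subword tokenisation is sensitive to spelling: a single dropped letter can change the whole segmentation, so the encoder sees different tokens. This toy greedy longest-match tokenizer uses a made-up vocabulary (ConveRT’s real subword vocabulary and algorithm differ), but shows the effect:

```python
# Toy vocabulary; real subword vocabularies are learned from data.
VOCAB = {"rest", "aur", "ant", "ur", "a", "e", "n", "r", "s", "t", "u"}

def subword_tokenize(word):
    """Greedy longest-match segmentation against VOCAB."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character falls through as-is
            i += 1
    return tokens

print(subword_tokenize("restaurant"))  # ['rest', 'aur', 'ant']
print(subword_tokenize("resturant"))   # ['rest', 'ur', 'ant']
```

The misspelling shifts the segmentation, so a model without character-level features gets a different input sequence for what a human reads as the same word.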

Do you have the training code, so that we can train a ConveRT for different languages and then use it in Rasa?

The original authors haven’t open-sourced the exact training pipeline, but here is the original repository: GitHub - PolyAI-LDN/polyai-models: Neural Models for Conversational AI

Hi, I’m getting this error:

ModuleNotFoundError: Cannot find class 'ConveRTTokenizer' from global namespace. Please check that there is no typo in the class name and that you have imported the class into the global namespace.

Does anybody have a clue on what it can be?

@tiziano What Rasa version are you using? ConveRTTokenizer was added in the 1.7.0 release.
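Once you are on a recent enough version, the pipeline in `config.yml` needs to reference the ConveRT components by name. Roughly like this (a minimal sketch, not the full recommended pipeline; check the docs for your exact Rasa version):

```yaml
language: en
pipeline:
  - name: ConveRTTokenizer
  - name: ConveRTFeaturizer
  - name: EmbeddingIntentClassifier
```

If the component name in the config doesn’t match a class Rasa can resolve (e.g. because the installed version predates it), you get exactly the `ModuleNotFoundError` above.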

Oh OK, from reading the main post of this topic I thought that 1.5.0 was enough.

I updated to 1.7, but I’m still getting the same error…

Anybody?

Did you try 1.10.1?