Feedback on ConveRT Model + Rasa NLU

Hey,

you need to run `ngrok tcp 22` as a command and ensure that SSH is running. As an alternative, we could use TeamViewer or something else.

Regards

I am fine with TeamViewer.

Hi,

julian.gerhard@susiandjames.com

Regards

Sorry, but does this stuff really belong in “Announcements”, and should it be broadcast via email?


Is it possible to persist the ConveRT model instead of downloading it on the fly during training?

@DivyashaAgrawal The ConveRT model is downloaded from TFHub the first time it is run on any new machine. It is persisted in TFHub’s own cache so it can be retrieved later. So the next time training or inference is run on the same machine, it won’t be downloaded but simply retrieved from the cache.
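As a sketch of where that cache lives: tensorflow_hub resolves each module handle to a directory under `TFHUB_CACHE_DIR` (I believe it names the directory with the SHA-1 of the handle; the handle URL and cache path below are hypothetical examples, not the real ConveRT URL):

```python
import hashlib
import os

# Pin TFHub's cache to a persistent location so the downloaded model
# survives across runs (hypothetical path; TFHUB_CACHE_DIR is the env var
# tensorflow_hub reads).
os.environ["TFHUB_CACHE_DIR"] = "/var/cache/tfhub_modules"

# tensorflow_hub keys each cache entry by the SHA-1 of the module handle,
# so repeated loads of the same URL resolve to the same directory.
handle = "https://example.com/convert/model.tar.gz"  # hypothetical handle
cache_key = hashlib.sha1(handle.encode("utf-8")).hexdigest()
cache_path = os.path.join(os.environ["TFHUB_CACHE_DIR"], cache_key)
print(cache_path)
```

If that directory is on persistent storage, training never re-downloads the model.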

Hi @arbazkhan971 Are you still facing any issue with the TF installation? If yes, I would suggest opening a new thread on the forum to discuss it, since it’s a much narrower topic and may not pertain to everyone on this thread. Feel free to post the link to the thread here if you wish. Thanks

Hey @BahlingerTh

Thank you for bringing this to our attention. We have adjusted the tracking settings for the announcement section so that it will only notify the community about the first announcement, and no longer notify for every response.

If you would like to make further adjustments to your personal settings, you can do this by clicking on the tracking circle in the top right of each topic page or going to ‘notifications’ in your user profile settings.


Tried ConveRT for intent classification, but not with Rasa.

Just loaded the TFHub model, created features from it, and tried fitting LogisticRegression, SVC, and 2x fully-connected TF layers on top of these features.
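That setup (encode the text, then fit a plain linear classifier on the encodings) can be sketched as follows. Random vectors stand in for the ConveRT sentence encodings here, since the real encoder has to be loaded from TFHub; the 1024-dimensional size is an assumption:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Random 1024-d vectors stand in for ConveRT sentence encodings.
rng = np.random.default_rng(0)
n, dim, n_intents = 200, 1024, 5
X = rng.normal(size=(n, dim))           # stand-in for encoder(texts)
y = rng.integers(0, n_intents, size=n)  # intent labels

# Fit a linear classifier on top of the frozen encodings.
clf = LogisticRegression(max_iter=1000).fit(X, y)
preds = clf.predict(X)
print(preds.shape)  # one predicted intent per utterance
```

Swapping `LogisticRegression` for `sklearn.svm.SVC(kernel="linear")` gives the second configuration reported below.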

ConveRT Featurizer + LogisticRegression: test results

                  precision    recall  f1-score   support

       micro avg       0.70      0.70      0.70      7934
       macro avg       0.55      0.45      0.49      7934
    weighted avg       0.69      0.70      0.69      7934

ConveRT Featurizer + SVM(kernel=linear): test results

                  precision    recall  f1-score   support

       micro avg       0.71      0.71      0.71      7934
       macro avg       0.54      0.47      0.49      7934
    weighted avg       0.70      0.71      0.70      7934

ConveRT Featurizer + 2x fully-connected TF layers with ReLU activation for both layers: test results

Don’t really have a classification_report for this, but mean test accuracy is ~0.66.

All the train stats for all the above models are >0.99 (micro, macro, weighted).

The most surprising observation from this experiment is that ConveRT + some classifier does not noticeably outperform the good ol’ CountVectorizer + LogisticRegression pipeline.

CountVectorizer + LogisticRegression : test results

                  precision    recall  f1-score   support

       micro avg       0.71      0.71      0.71      7934
       macro avg       0.55      0.46      0.48      7934
    weighted avg       0.70      0.71      0.70      7934
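For reference, that baseline is only a few lines in scikit-learn. A minimal runnable sketch, with toy texts and intent labels standing in for the real customer data:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the real utterances and intents.
texts = ["book a flight", "book a hotel", "cancel my flight",
         "cancel my hotel", "play some music", "play a song"]
labels = ["flight", "hotel", "flight", "hotel", "music", "music"]

# Bag-of-words features feeding a linear classifier.
baseline = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
baseline.fit(texts, labels)
print(baseline.predict(["play a song"])[0])
```

Because CountVectorizer matches surface tokens exactly, this baseline is at least not penalised for out-of-vocabulary spellings the way a pretrained encoder can be.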

This may be specific to my data, so here’s some context:

  1. real customer data
  2. mix of English and Hindi language
  3. LOTS of spelling errors
  4. 50+ intents, whose frequencies approximately follow a Zipf distribution

Wondering why this is the case. @matthen @dakshvar22

We typically train with L2 normalisation of the sentence encodings, and very high dropout.
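The L2 normalisation part is easy to try on top of any of the classifiers above (the dropout would live inside the downstream network). A minimal sketch:

```python
import numpy as np

# L2-normalise each sentence encoding so every row has unit Euclidean
# norm before it is fed to the classifier.
def l2_normalize(x, eps=1e-12):
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)  # eps guards against zero vectors

enc = np.array([[3.0, 4.0], [0.0, 2.0]])  # toy 2-d "encodings"
unit = l2_normalize(enc)
print(np.linalg.norm(unit, axis=-1))  # each row now has norm 1.0
```

After this, a linear classifier effectively operates on cosine-similarity geometry rather than raw encoding magnitudes.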

Pretty close to what we had before, and definitely helping our edge cases (100+ intents).

"micro avg": {
    "precision": 0.9821576351418452,
    "recall": 0.9821576351418452,
    "f1-score": 0.9821576351418452,
    "support": 30209
},
"weighted avg": {
    "precision": 0.9818330519201411,
    "recall": 0.9821576351418452,
    "f1-score": 0.9817390316733208,
    "support": 30209
}

The data model is English words only; that may be a source of the issue.

I was also going through the ConveRT paper last night (I can never spell it right), and I noticed it uses subword tokens but not character-level embeddings, so I bet spelling has a huge impact on performance. I was going to experiment with that myself this morning.
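To illustrate why subword tokenisation is sensitive to spelling: a single dropped letter can change the whole segmentation, so the encoder sees different tokens. This toy greedy longest-match tokenizer uses a made-up vocabulary (ConveRT’s real subword vocabulary and algorithm differ), but shows the effect:

```python
# Toy vocabulary; real subword vocabularies are learned from data.
VOCAB = {"rest", "aur", "ant", "ur", "a", "e", "n", "r", "s", "t", "u"}

def subword_tokenize(word):
    """Greedy longest-match segmentation against VOCAB."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character falls through as-is
            i += 1
    return tokens

print(subword_tokenize("restaurant"))  # ['rest', 'aur', 'ant']
print(subword_tokenize("resturant"))   # ['rest', 'ur', 'ant']
```

The misspelling shifts the segmentation, so a model without character-level features gets a different input sequence for what a human reads as the same word.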

Do you have the training code, so that we can train a ConveRT for different languages and then use it in Rasa?

The original authors haven’t open-sourced the exact training pipeline, but here is the original repository: GitHub - PolyAI-LDN/polyai-models: Neural Models for Conversational AI

Hi, I’m getting this error:

ModuleNotFoundError: Cannot find class 'ConveRTTokenizer' from global namespace. Please check that there is no typo in the class name and that you have imported the class into the global namespace.

Does anybody have a clue on what it can be?

@tiziano What Rasa version are you using? ConveRTTokenizer was added in the 1.7.0 release.
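Once you are on a recent enough version, the pipeline in `config.yml` needs to reference the ConveRT components by name. Roughly like this (a minimal sketch, not the full recommended pipeline; check the docs for your exact Rasa version):

```yaml
language: en
pipeline:
  - name: ConveRTTokenizer
  - name: ConveRTFeaturizer
  - name: EmbeddingIntentClassifier
```

If the component name in the config doesn’t match a class Rasa can resolve (e.g. because the installed version predates it), you get exactly the `ModuleNotFoundError` above.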

Oh OK, from reading the main post of this topic I thought that 1.5.0 was enough.

I updated to 1.7, but I’m still getting the same error…

Anybody?

Did you try 1.10.1?