spaCy alpha tokenization language support

mniemiec · January 3, 2019, 8:29pm

Hi guys, I have a question about Rasa and spaCy intent classification. How can I use a language that spaCy lists only under alpha tokenization support? Or is that not possible, so that I have to use only the fully supported languages?

This is my code:

from rasa_nlu import config
from rasa_nlu.model import Interpreter, Trainer
from rasa_nlu.training_data import load_data

# Train the pipeline described in config_spacy.yaml on the example data.
train_data = load_data('./rasa_dataset.json')
trainer = Trainer(config.load('./config_spacy.yaml'))
trainer.train(train_data)

# persist() returns the path of the saved model directory.
model_directory = trainer.persist('/projects/')

# Load the trained model and parse a test message.
interpreter = Interpreter.load(model_directory)
result = interpreter.parse(u"Where I can find the nearest restaurant?")
print(result)

My config_spacy.yaml file has only:

language: "en"
pipeline: "spacy_sklearn"

It works perfectly fine with the English model.

Thanks for any help.

MetcalfeTom · January 18, 2019, 3:36pm

Hi @mniemiec,

I’m not sure what you mean by alpha tokenisation - do you mean only including alpha tokens (i.e. only tokens made from alphabetical characters)? Or is it a separate language model?
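To make the first reading concrete, here is a minimal sketch of "only including alpha tokens". It is an illustration only: it splits on whitespace and uses Python's built-in str.isalpha instead of spaCy's tokenizer, and the helper name alpha_tokens is hypothetical. In spaCy itself the comparable check would be the is_alpha attribute on each token.

```python
def alpha_tokens(text):
    """Return the whitespace-separated tokens made up only of letters.

    Assumption: plain str.split() stands in for a real tokenizer here.
    """
    return [tok for tok in text.split() if tok.isalpha()]


tokens = alpha_tokens("Where can I find the 2 nearest restaurants ?")
print(tokens)
# ['Where', 'can', 'I', 'find', 'the', 'nearest', 'restaurants']
```

The other reading, a language for which spaCy ships tokenization rules but no pretrained model, is a different question, and this sketch does not cover it.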
