How to set the pipeline config for another language

YYtheFenix · May 14, 2021, 11:10am

I have already tested my code on a splot task and it works well with English. I have now switched my data to Thai, then retrained and retested, but the default pipeline settings no longer give accurate results. I think the cause is tokenization, so I tried changing the tokenization method in the pipeline as follows:

language: th

pipeline:
  - name: "SpacyNLP"
    model: "xx_ent_wiki_sm"
  - name: "SpacyTokenizer"

The spaCy library and the spaCy model "xx_ent_wiki_sm" are both installed, but the results are still inaccurate. I have three questions:
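
The suspected tokenization problem can be illustrated with a minimal sketch in plain Python, independent of Rasa and spaCy. Thai is written without spaces between words, so a whitespace-only split returns a whole phrase as a single token, while the same approach separates English words. The sample sentences and the helper name `whitespace_tokens` are illustrative assumptions, not part of the setup described above.

```python
def whitespace_tokens(text):
    """Split on whitespace only, as a naive tokenizer would."""
    return text.split()


# Illustrative sample sentences (assumptions, not taken from the training data).
english = "I want to book a flight"
thai = "ฉันต้องการจองตั๋วเครื่องบิน"  # a similar request, written without spaces

print(len(whitespace_tokens(english)))  # 6
print(len(whitespace_tokens(thai)))     # 1
```

If the tokenizer in the pipeline behaves like this on Thai input, every utterance reaches the featurizer as one long unseen token, which would be consistent with the drop in accuracy.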

1. Are the settings above correct?
2. Are there any examples of custom tokenizer, featurizer, and classifier components written in my own .py file?
3. What are the pipeline's default settings (tokenizer, featurizer, classifier model) when nothing is specified?
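
As a rough idea of what the tokenization logic in a custom .py file could look like, here is a minimal sketch of greedy longest-match segmentation against a word list, a common baseline for languages written without spaces. It is plain Python only: the function name `longest_match_tokens` and the five-word vocabulary are hypothetical, and wiring such a function into Rasa would still require wrapping it in the custom component interface of the installed Rasa version, which is not shown here.

```python
def longest_match_tokens(text, vocabulary):
    """Greedy longest-match segmentation against a known word list."""
    max_len = max(len(word) for word in vocabulary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
        else:
            # No vocabulary word starts here: emit one character and move on.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy vocabulary; a real tokenizer needs a full Thai word list.
words = ["ฉัน", "ต้องการ", "จอง", "ตั๋ว", "เครื่องบิน"]
sentence = "".join(words)  # Thai text has no spaces between words

print(longest_match_tokens(sentence, set(words)))
```

With this toy vocabulary the sentence is split back into its five words. Real text needs a much larger dictionary and some handling of ambiguous splits, which is why an existing Thai word segmentation library is usually a better starting point than a hand-written matcher.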

Thank you for your replies.
