ENTITY EXTRACTION ISSUE

shubham · July 12, 2019, 4:41am

I am facing an issue while extracting entities that share the same pattern:

Tcode is T3WKFR
order no is MS5WDF

Both sentences are similar except for their beginnings. How can I differentiate the two and make entity extraction accurate?

Training data:

intent:Transaction_details

Tcode is T3WKFR
with Tcode VSRDGB
Tcode is GSFRGB
Tcode is IFFCL / ZPRCL
with Tcode 4G6RKS
Tcode is VLVF0CA/VF100
with Tcode NLXP67CA/AU330

intent:Order_details

order no is ZTEGCJ
with order no JTJ7GQ
order no is SSW8N8
with order no LK2SM1
order no is MS5WDF
order no is PS28SS
with order no 3R2DDS

@akelad @JulianGerhard @Juste @juste_petr

shanushawan · July 12, 2019, 4:45am

I am also facing a similar issue. Can anyone help? @akelad @JulianGerhard @Juste @shubham

JiteshGaikwad · July 12, 2019, 5:00am

Hey @shubham, I suggest you try the regex feature for this problem. You can add patterns for the above sentences:
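The patterns themselves are not shown in this post. As a rough sketch only, assuming the codes are six uppercase letters or digits (true for most of the examples above, but an assumption), a candidate pattern can be checked in plain Python before adding it to the training data. The names `CODE_PATTERN` and `find_codes` are made up for this illustration:

```python
import re

# Assumption: a code is exactly six uppercase letters or digits,
# e.g. T3WKFR or MS5WDF from the examples above.
CODE_PATTERN = re.compile(r"\b[A-Z0-9]{6}\b")


def find_codes(text):
    """Return every six-character uppercase alphanumeric token in text."""
    return CODE_PATTERN.findall(text)


print(find_codes("Tcode is T3WKFR"))     # ['T3WKFR']
print(find_codes("order no is MS5WDF"))  # ['MS5WDF']
```

Note that this pattern matches both kinds of code in exactly the same way, so on its own it can only signal "this token looks like a code"; the surrounding words still have to tell the two apart. Compound values such as VLVF0CA/VF100 would not match and would need a broader pattern.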

shubham · July 12, 2019, 6:58am

It’s still not working for me; entity extraction is not accurate. Can anyone help me with this?

Where should I make the changes: in my training data, or in the entity extractor?

@akelad @JulianGerhard @Juste

akelad · July 13, 2019, 5:15pm

What does your NLU config look like?

shubham · July 15, 2019, 7:26am

language: en
pipeline:
- name: "SpacyNLP"
  model: "en_core_web_lg"
- name: "SpacyTokenizer"
- name: "SpacyFeaturizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
  features: [
    ["low", "title", "upper"],
    ["bias", "low", "prefix5", "prefix2", "suffix5", "digit", "suffix3", "suffix2", "upper", "title", "pattern"],
    ["low", "title", "upper"]
  ]
  BILOU_flag: true
  max_iterations: 50
  L1_c: 0.1
  L2_c: 0.1
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
  stop_words: ['ourselves', 'hers', 'between', 'yourself', 'but', 'again', 'there', 'about', 'once', 'during', 'out', 'very', 'having', 'with', 'they', 'own', 'an', 'be', 'some', 'for', 'do', 'its', 'yours', 'such', 'into', 'of', 'most', 'itself', 'off', 'is', 's', 'am', 'or', 'who', 'as', 'from', 'him', 'each', 'the', 'themselves', 'until', 'below', 'are', 'we', 'these', 'your', 'his', 'through', 'don', 'nor', 'me', 'were', 'her', 'more', 'himself', 'this', 'down', 'should', 'our', 'their', 'while', 'above', 'both', 'up', 'to', 'ours', 'had', 'she', 'all', 'no', 'when', 'at', 'any', 'before', 'them', 'same', 'and', 'been', 'have', 'in', 'will', 'on', 'does', 'yourselves', 'then', 'that', 'because', 'what', 'over', 'why', 'so', 'can', 'did', 'not', 'now', 'under', 'he', 'you', 'herself', 'has', 'just', 'where', 'too', 'only', 'myself', 'which', 'those', 'i', 'after', 'few', 'whom', 't', 'being', 'if', 'theirs', 'my', 'against', 'a', 'by', 'doing', 'it', 'how', 'further', 'was', 'here', 'than']
- name: "EmbeddingIntentClassifier"
  intent_tokenization_flag: true
  intent_split_symbol: "+"

policies:
- name: MemoizationPolicy
- name: KerasPolicy
- name: MappingPolicy

msamogh · July 17, 2019, 8:39am

I suggest you put them both under one entity called “code” or something, and later differentiate them based on the output of the intent classification. Let me know if that makes sense for your use case.
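A minimal sketch of that idea, assuming the NLU result is a dict with an `intent` name and a list of `entities`, and that both kinds of value are annotated with one generic entity called `code`. The slot names `Tcode` and `order_no`, the mapping table, and the helper `resolve_code` are hypothetical, and the example input is hand-written rather than real model output:

```python
# Hypothetical mapping from the predicted intent to the slot
# that the generic "code" entity should fill.
INTENT_TO_SLOT = {
    "Transaction_details": "Tcode",
    "Order_details": "order_no",
}


def resolve_code(parse_result):
    """Map a generic "code" entity to a specific slot using the intent.

    Returns a (slot_name, value) tuple, or None when the intent is not
    in the mapping or no "code" entity was extracted.
    """
    intent_name = parse_result.get("intent", {}).get("name")
    slot_name = INTENT_TO_SLOT.get(intent_name)
    if slot_name is None:
        return None
    for entity in parse_result.get("entities", []):
        if entity.get("entity") == "code":
            return slot_name, entity.get("value")
    return None


# Hand-written example input, not actual output from a trained model.
example = {
    "text": "order no is MS5WDF",
    "intent": {"name": "Order_details"},
    "entities": [{"entity": "code", "value": "MS5WDF"}],
}
print(resolve_code(example))  # ('order_no', 'MS5WDF')
```

With this split, the entity extractor only has to learn what a code looks like, and the decision between Tcode and order number rests on the intent classifier, which sees the whole sentence.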
