Remove stop words for NER_CRF?

(Datisto) #1

I am wondering whether I should simply remove stop words such as articles and prepositions. A language like German has many forms of articles (ein/eine/einer/einem) and many prepositions (für/zu/bei/mit). For NER_CRF, would it make sense to remove those words instead of training on every possible sentence that differs only in those words?

What do you say?

Would the best place to do that be the _from_json_to_crf function in the CRF extractor, or is there a better place?
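
To make the idea concrete, here is a rough sketch of the filtering I have in mind. It is not Rasa code: `remove_stop_words`, `GERMAN_STOP_WORDS` and the BIO-style labels are hypothetical names I made up for illustration, and I am assuming the tokens and their entity labels are available as two parallel lists at that point. Stop words that belong to an entity are kept, so an entity like "Neustadt bei Coburg" is not broken up.

```python
# Hypothetical stop word list, taken from the examples above.
# A real list would be much longer.
GERMAN_STOP_WORDS = {"ein", "eine", "einer", "einem", "für", "zu", "bei", "mit"}


def remove_stop_words(tokens, labels, stop_words=GERMAN_STOP_WORDS):
    """Drop stop word tokens that are not part of an entity.

    tokens: list of str
    labels: list of str with the same length; "O" marks a non-entity token
    Returns the filtered tokens and labels as two parallel lists.
    """
    kept = [
        (token, label)
        for token, label in zip(tokens, labels)
        # Only drop a token if it is a stop word AND lies outside any entity.
        if not (label == "O" and token.lower() in stop_words)
    ]
    return [token for token, _ in kept], [label for _, label in kept]


tokens = ["Ich", "suche", "ein", "Hotel", "in", "Neustadt", "bei", "Coburg"]
labels = ["O", "O", "O", "O", "O", "B-city", "I-city", "I-city"]

filtered_tokens, filtered_labels = remove_stop_words(tokens, labels)
print(filtered_tokens)
print(filtered_labels)
```

Here "ein" is dropped, while "bei" stays because it is part of the city name. The same filtering would presumably also have to be applied to incoming messages at prediction time, and any character offsets of the entities would no longer match the filtered text.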