Memory Issue

Using this pipeline

```yaml
pipeline:
  - name: "SpacyNLP"
  - name: "WhitespaceTokenizer"
  - name: "SpacyFeaturizer"
  - name: "RegexFeaturizer"
  - name: "EntitySynonymMapper"
  - name: "SklearnIntentClassifier"
  - name: "CRFEntityExtractor"
  - name: "DucklingHTTPExtractor"
    url: "http://localhost:8000"
    dimensions: ["time", "number", "distance", "email", "amount-of-money"]
    locale: "en_GB"
    timezone: "Europe/London"
policies:
```

Memory consumption increases as the size of the NLU data increases. Please recommend the best pipeline so that memory consumption is handled.

Thanks

> Memory consumption increases as the size of the NLU data increases. Please recommend the best pipeline so that memory consumption is handled.

What would you consider as “memory consumption is handled”?

@Tobias_Wochinger @dakshvar22 Here the CRF entity extractor is using more memory than the other entity extractors. How can we handle the memory consumption on AWS servers as the NLU data increases, since it blocks the instance?

> than the other entity extractors

Can you name a few examples? CRF is an actual machine-learning-based algorithm which has to be trained from scratch, in contrast to the others, which are either rule-based or pretrained.
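For context, a minimal sketch of what such alternatives look like in a pipeline (assuming a Rasa 1.x config; the `url` value is the local Duckling server from the pipeline above). Neither component trains on your NLU data, so their memory footprint does not grow with it:

```yaml
pipeline:
  # Pretrained: reuses spaCy's statistical NER model; nothing is
  # trained on your NLU data.
  - name: "SpacyEntityExtractor"
  # Rule-based: entity parsing is delegated to an external Duckling
  # server, so the NLU process itself stays lightweight.
  - name: "DucklingHTTPExtractor"
    url: "http://localhost:8000"
```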

Apart from giving the machines more power, you could try lowering the number of features you feed into the component (see the CRFEntityExtractor docs). Lowering max_iterations might also help; see the sketch below.
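As a rough illustration only (not a recommended configuration; the feature names and defaults are documented in the CRFEntityExtractor docs), a trimmed-down config might look like this:

```yaml
pipeline:
  - name: "CRFEntityExtractor"
    # Reduced sliding-window feature set: [previous token, current
    # token, next token]. Fewer features means a smaller model and
    # less memory during training.
    features:
      - ["low"]
      - ["bias", "low", "upper", "title", "digit"]
      - ["low"]
    # Fewer optimisation iterations (default is 50); trades some
    # accuracy for lower training cost.
    max_iterations: 20
```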