How to enable Huawei Ascend 910B support in Rasa NLU pipelines?

**Question:**

When deploying Rasa (version 3.x) on a system with Huawei Ascend 910B NPUs, the NLU components (such as the HuggingFace transformers featurizer and `DIETClassifier`) default to CPU execution even though the hardware is available. What's the proper way to:

  1. **Enable NPU acceleration** for Rasa’s ML components?

    • Are there specific versions of PyTorch/TensorFlow that support 910B’s CANN stack?

    • Does Rasa require custom build flags or environment variables?
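For context, here is the probe I've been using to check whether the Ascend PyTorch adapter is even visible to the Python environment Rasa runs in. It assumes Huawei's `torch_npu` package (the CANN-backed PyTorch plugin), which registers a `torch.npu` namespace on import; treat the exact API as an assumption, since it varies between adapter releases:

```python
import importlib.util


def select_device() -> str:
    """Return 'npu' if the Ascend torch_npu adapter is importable and an
    NPU is visible, otherwise fall back to 'cpu'."""
    # Probe for the adapter without importing torch unconditionally.
    if importlib.util.find_spec("torch_npu") is None:
        return "cpu"
    import torch
    import torch_npu  # noqa: F401  # registers the 'npu' backend on torch
    # torch.npu.is_available() is provided by the adapter (assumption:
    # current torch_npu releases expose it, mirroring torch.cuda).
    return "npu" if torch.npu.is_available() else "cpu"


print("Selected device:", select_device())
```

On my box this prints `cpu`, which is consistent with the components never touching the NPU — so the first question is really whether the right `torch`/`torch_npu` version pair for the 910B's CANN stack needs to be pinned before Rasa is installed.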

  2. **Container deployment considerations**:

    • For Docker/Kubernetes deployments, are there base images that include both Rasa and the Ascend 910B runtime (the NPU driver plus the Ascend-CANN-toolkit)?

    • Any known issues with NPU support in Rasa’s official Docker images?
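Since I couldn't find an official Rasa image with CANN baked in, I've been sketching a custom deployment. For Kubernetes, assuming Huawei's Ascend device plugin is installed on the node (it exposes the NPUs as an extended resource; the resource name below and the image name are assumptions on my part, not official artifacts), the pod spec fragment looks like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rasa-nlu
spec:
  containers:
    - name: rasa
      # Hypothetical custom image layering Rasa 3.x on top of an
      # Ascend-CANN-toolkit base -- not an official Rasa image.
      image: my-registry/rasa-cann:3.x
      resources:
        limits:
          # Extended resource name exposed by the Ascend k8s device
          # plugin; may differ by plugin version.
          huawei.com/Ascend910: 1
```

Is this the expected shape, or does Rasa's image need additional driver libraries mounted from the host?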

  3. **Performance tradeoffs**:

    • Has anyone benchmarked 910B vs GPU/CPU for Rasa’s NLU tasks?

    • Are certain pipeline components (e.g., HF transformers) more amenable to NPU acceleration than others?
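Lacking published numbers, I've been measuring latency myself with a small stdlib-only harness (nothing Rasa- or NPU-specific; the `fn` you pass in is whatever inference call you want to compare across backends):

```python
import time
from statistics import median


def bench(fn, warmup: int = 3, iters: int = 10) -> float:
    """Median wall-clock latency of fn() in milliseconds.

    Warmup runs are discarded so one-time costs (graph compilation,
    cache population) don't skew the comparison between backends.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return median(samples)
```

For example, wrapping a single `transformers` forward pass in `fn` with the model moved to `cpu` versus the NPU device should give a rough per-component comparison; happy to share numbers if someone can confirm the correct setup for question 1.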