Does the Duckling server URL become a hard-coded value of the model?
I ask because a change to the URL with a subsequent call to “rasa train” results in a new model, whereas a call to “rasa train” (without changing the URL) results in a “Nothing changed”-message.
I couldn’t find anything about it in the documentation. Having the Duckling URL as a hard-coded component of the model would make deployment very complicated (e.g. the port used during testing may not be available in a production environment).
If the URL changes then the output of your model should not change. From a modelling perspective you are correct to say “no change has been made”.
However, Rasa uses the files on disk as a checking mechanism to see if there’s been a change locally. I don’t know 100% for sure, but I think it is using a file hash to determine if the config.yml/nlu.md/stories.md/domain.yml files changed. If a change is detected, this is picked up as a reason to retrain. You have a similar issue when you rename an action/utterance: it is picked up as a new state.
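To make the idea concrete, here is a minimal sketch of a file-hash check, assuming nothing about Rasa's real internals. The function names `fingerprint` and `needs_retraining` are made up for illustration. It shows why editing only the URL still counts as a change: the hash covers the raw bytes of the file, not what the settings mean.

```python
import hashlib
from pathlib import Path


def fingerprint(paths):
    """Return one SHA-256 hex digest covering the contents of all given files."""
    digest = hashlib.sha256()
    # Sort so the result does not depend on the order the paths were passed in.
    for path in sorted(Path(p) for p in paths):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_retraining(paths, previous_fingerprint):
    """True when the files differ from the state recorded at the last training run."""
    return fingerprint(paths) != previous_fingerprint
```

With a check like this, changing a single character in config.yml (a port number, for instance) yields a different digest, so the training step cannot tell a harmless edit from one that affects the model.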
If you were deploying this, however, it is a little bit different. Typically you’d be using Rasa X on something like Kubernetes. In this case the Duckling service would run inside its own container as a separate service, and you would point to the endpoint that Kubernetes supplies here. There’s also a variant of this method of deployment that uses Docker Compose.
After glancing through the docs, it does seem like you can configure the endpoint using environment variables as well. Here’s a reference to the Docker Compose installation docs (you may need to scroll to the bottom).
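As a rough sketch of how such an override usually works (not a copy of Rasa's code): the environment variable wins if it is set, otherwise the `url` from the trained component's configuration is used. The helper `resolve_duckling_url` is hypothetical. The default variable name `RASA_DUCKLING_HTTP_URL` is the one I believe the Duckling extractor reads, but please verify it against the Rasa version you run.

```python
import os


def resolve_duckling_url(component_config, env_var="RASA_DUCKLING_HTTP_URL", environ=None):
    """Pick the Duckling URL: environment override first, then the stored config.

    component_config is the dict of settings stored with the model.
    environ can be passed in for testing; it defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    override = environ.get(env_var)
    if override:
        return override
    return component_config.get("url")
```

If the override behaves like this in your version, a pretrained model could be pointed at a different host or port at startup without retraining.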
I just tested it today. Unfortunately, it seems the Duckling-URL used during training is a hard-coded part of the model, because changing the URL afterwards (in config.yml) does not get reflected. I assume this is a bug?
Nota bene: We (or rather the customer, in this case) don’t want to use Docker.
The configuration in the config.yml should tell Rasa where it can find the Duckling service but it doesn’t run the Duckling service for you. This needs to be running beforehand.
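If you want to confirm that Duckling is actually up before starting Rasa, a small probe along these lines could help. It is a sketch using only the standard library. It assumes the stock Duckling HTTP server, which accepts form-encoded `locale` and `text` fields on `POST /parse`. The function names and the sample text are illustrative.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_parse_request(base_url, text="tomorrow at eight", locale="en_GB"):
    """Build a POST request for Duckling's /parse endpoint without sending it."""
    body = urllib.parse.urlencode({"locale": locale, "text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/parse", data=body, method="POST"
    )


def duckling_is_reachable(base_url, timeout=2.0):
    """Return True if the server answers the probe with HTTP 200."""
    try:
        request = build_parse_request(base_url)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or a non-2xx HTTP error.
        return False
```

Calling `duckling_is_reachable("http://localhost:8000")` before `rasa run` gives an early, explicit failure instead of silently missing entities.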
What do you mean by
because changing the URL afterwards (in config.yml) does not get reflected
A trained model will be zipped and placed in a models folder. This will contain all the settings. So if you switch the Duckling URL, you may indeed need to retrain the system.
I misunderstood you. Your last sentence was exactly the info I was looking for.
Anyway, making a port (or an entire URI) a hard-coded setting doesn’t seem like a good idea to me:
In this case we pretrain the model on our machines and make it available for download. If the Duckling port we used while training the model is already in use on a customer’s machine, there is no way to change the port except to train a new model (which takes some time).
I certainly agree that it is not ideal with regards to the port number.
That said, the other concern that we have is that we must ensure that we retrain the model every time another change is made to that config.yml file. We might be open to a parser that does more than a checksum on the config.yml file but it’s tricky to get that right. Especially when you consider people can write their own custom components and we cannot know upfront which parameters are static and which ones are ‘loose’.
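To illustrate what "more than a checksum" could look like, here is a hedged sketch. It hashes the parsed pipeline instead of the raw file and skips parameters that are declared as loose. The `loose_keys` mapping is purely hypothetical: nothing like it exists in Rasa as far as this thread establishes, and someone would have to maintain it per component, which is exactly the tricky part for custom components.

```python
import hashlib
import json


def config_fingerprint(pipeline, loose_keys):
    """Hash a pipeline (a list of component dicts), skipping 'loose' parameters.

    loose_keys maps a component name to the set of parameter names that
    should not trigger retraining (hypothetical, declared by hand).
    """
    stable = []
    for component in pipeline:
        ignored = loose_keys.get(component.get("name"), set())
        stable.append({k: v for k, v in component.items() if k not in ignored})
    # sort_keys makes the digest independent of key order inside a component.
    canonical = json.dumps(stable, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

For example, with `loose_keys = {"DucklingHTTPExtractor": {"url"}}`, two pipelines that differ only in the Duckling `url` produce the same fingerprint, while a change to `dimensions` still produces a new one. Any component missing from the mapping falls back to the strict behaviour, where every parameter counts.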
This issue has been around for a couple of years now:
As I say in the last comment, we ended up using a big, external Duckling server that can be used in all environments. It is not an optimal solution, but I understand that handling custom components can be tricky.