I’m wondering whether, in your projects, you avoid training and testing locally because of the time it takes, and instead train and test only in a central cloud environment to save time. We’re currently spending a lot of time on training and are trying to establish the best working process. Let me know what you think. Thanks.
It depends on how long the model takes to train. I recommend training and testing locally before pushing a PR with bot changes. If training takes a long time and needs a GPU, then run it as part of the CI pipeline with a GPU.
Thanks for the response. Yes, we’re pushing to our CI pipeline. I was looking for opportunities to optimize it, as it still takes a bit of time. I’m starting to onboard more engineers who will be committing code to this project, so I’m looking for validation that I’m approaching this with best practices in mind. Thanks again, Stephens.
We’re using GitHub Actions, where training takes anywhere from 15 to 30 minutes. We’ve also used GCP for training and seen slightly faster speeds, but for the time being GitHub Actions is our primary SOP.