I’m new to Ray and have a question I can’t find an answer to. I want to tune architecture-related hyperparameters, such as the number of layers or the number of feature maps. These hyperparameters change the number and shapes of the weights in the network, so networks with different architectures can’t simply exchange weights. As I understand it, population-based training (PBT) lets a trial take over the hyperparameters and weights of a better-performing trial. That would be a problem here, because changing an architecture hyperparameter would change the architecture, and the model would no longer match the saved state_dict.
So I am wondering whether a search technique exists in Ray Tune where each network always starts training from the first epoch (like basic ASHA), but the hyperparameters are still crossbred according to performance, as in PBT. This would work because the networks wouldn’t need to have the same weight structure.
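To make the idea concrete, here is a rough plain-Python sketch of what I have in mind. It does not use any Ray Tune API; the search space, `train_from_scratch`, and its scoring formula are made-up placeholders standing in for building a fresh model and training it from epoch 0.

```python
import random

# Hypothetical search space: architecture choices only.
SEARCH_SPACE = {
    "num_layers": [2, 3, 4, 5],
    "num_feature_maps": [16, 32, 64, 128],
}


def sample_config(rng):
    """Draw a random architecture configuration."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Take each hyperparameter from one of the two parents at random."""
    return {name: rng.choice([parent_a[name], parent_b[name]]) for name in SEARCH_SPACE}


def mutate(config, rng, rate=0.2):
    """Resample each hyperparameter with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in config.items()
    }


def train_from_scratch(config):
    """Placeholder objective (assumption): a real version would build a new
    model from `config`, train it from the first epoch, and return a
    validation score. No weights are shared between configurations."""
    return -abs(config["num_layers"] - 4) - abs(config["num_feature_maps"] - 64) / 32


def evolve(generations=5, population_size=8, seed=0):
    """Crossbreed hyperparameters only; every child is trained from scratch."""
    rng = random.Random(seed)
    population = [sample_config(rng) for _ in range(population_size)]
    best = None
    for _ in range(generations):
        # Score every configuration, best first.
        scored = sorted(
            ((train_from_scratch(cfg), cfg) for cfg in population),
            key=lambda pair: pair[0],
            reverse=True,
        )
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        # The top half become parents of the next generation.
        parents = [cfg for _, cfg in scored[: population_size // 2]]
        population = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size)
        ]
    return best


if __name__ == "__main__":
    score, config = evolve()
    print(score, config)
```

Because only the hyperparameters are inherited and every child is retrained from the first epoch, no state_dict ever has to be loaded into a network with a different architecture.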
Any help would be appreciated. Thank you!