I’m trying to set up multiple deployments serving different models, and I want to route some percentage of the load to one model and the rest to another. It looks like this was possible in older versions, but the functionality (set_traffic) is going to be removed in a future release.
Is there any way to achieve this?
Hi, did you figure this out? I have this exact same question. @bavaria95
Hi @Oyin. Actually, I still haven’t figured out how to do this. I’d probably have to implement this logic inside the serving class (though that approach has some disadvantages too).
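For what it's worth, here is a minimal sketch of what "logic inside the serving class" could look like. It is plain Python and deliberately uses no serving-framework API; `WeightedRouter`, the handler callables, and the 80/20 weights are all hypothetical names and values chosen for illustration. The idea is just to pick a model per request with a weighted random draw.

```python
import random
from typing import Any, Callable, Dict, Optional, Tuple


class WeightedRouter:
    """Sends each request to one of several models according to fixed weights.

    Hypothetical helper: in a real setup the handlers would be whatever
    objects you use to call each model (assumption, not a framework API).
    """

    def __init__(
        self,
        handlers: Dict[str, Callable[[Any], Any]],
        weights: Dict[str, float],
        seed: Optional[int] = None,
    ) -> None:
        if set(handlers) != set(weights):
            raise ValueError("handlers and weights must have the same keys")
        if any(w < 0 for w in weights.values()) or sum(weights.values()) <= 0:
            raise ValueError("weights must be non-negative and not all zero")
        self._handlers = handlers
        self._names = list(handlers)
        self._weights = [weights[name] for name in self._names]
        # A private RNG keeps the routing reproducible when a seed is given.
        self._rng = random.Random(seed)

    def pick(self) -> str:
        """Return the name of the model that should serve the next request."""
        return self._rng.choices(self._names, weights=self._weights, k=1)[0]

    def __call__(self, request: Any) -> Tuple[str, Any]:
        """Route one request and return (model_name, model_output)."""
        name = self.pick()
        return name, self._handlers[name](request)


if __name__ == "__main__":
    # Stand-in "models": simple functions instead of real deployments.
    router = WeightedRouter(
        handlers={
            "model_a": lambda x: f"A({x})",
            "model_b": lambda x: f"B({x})",
        },
        weights={"model_a": 0.8, "model_b": 0.2},
    )
    print(router("hello"))
```

One likely disadvantage of this approach is that the split is decided in application code rather than by the serving layer, so changing the percentages means updating and redeploying the router itself.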