When I configure a RayJob via its CRD without enableInTreeAutoscaling=true, it runs, but it does not scale up based on the workload the way a regular RayCluster does.
The RayJob CRD allows for spec.rayClusterSpec.enableInTreeAutoscaling, but when I deploy a RayJob with this set, the operator seems to get stuck in a loop: it detects a change, deletes the workers back down to 1 replica, the autoscaler sidecar then tries to scale the cluster up again, and the cycle repeats.
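A trimmed sketch of the kind of manifest described above, to make the setup concrete. It is illustrative rather than the exact file in use: the metadata name, entrypoint, group name, and replica bounds are placeholders, and the apiVersion assumes the v1alpha1 RayJob CRD.

```yaml
# Hypothetical RayJob manifest; name, entrypoint, and replica bounds are placeholders.
apiVersion: ray.io/v1alpha1          # assumed CRD version
kind: RayJob
metadata:
  name: rayjob-autoscale-example     # placeholder name
spec:
  entrypoint: python /home/ray/app/main.py   # placeholder entrypoint
  rayClusterSpec:
    rayVersion: "2.4.0"
    enableInTreeAutoscaling: true    # the setting under discussion
    headGroupSpec:
      rayStartParams:
        dashboard-host: "0.0.0.0"
      template:
        spec:
          containers:
            - name: ray-head
              image: rayproject/ray:2.4.0-py39
    workerGroupSpecs:
      - groupName: workers           # placeholder group name
        replicas: 1                  # workers keep getting reset to this count
        minReplicas: 1
        maxReplicas: 5               # placeholder upper bound for the autoscaler
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker
                image: rayproject/ray:2.4.0-py39
```

With a spec along these lines, the expectation is that the autoscaler can move the worker count anywhere between minReplicas and maxReplicas without the operator resetting it.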
What is the proper way to configure a RayJob CRD so that the RayCluster it creates is permitted to scale up?
I am testing with Ray 2.4.0 on Python 3.9, using the official Docker image, on K8s 1.24.