I am new to Ray and to reinforcement learning. I am trying to build a PPO model with Ray that predicts the cloud hardware capacity (compute and memory) an organization requires. My reward is a simple calculation based on how far the prediction is from a utilization threshold provided by the user. For example, if the user wants compute utilization to be 80%, the reward is the difference between the optimum value and the predicted value.
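As a rough sketch of that reward (not the exact code): the function name, the argument names, and the choice to negate the gap so that a closer prediction scores higher are all assumptions made for illustration.

```python
def utilization_reward(predicted_capacity, observed_usage, target_utilization=0.8):
    """Negative absolute gap between the optimum capacity and the prediction.

    All names here are hypothetical. The sign convention (0 is best, more
    negative is worse) is an assumption, not something stated in the post.
    """
    # Capacity at which the observed usage would sit exactly at the target
    # utilization, e.g. 80 units of usage at an 80% target -> 100 units.
    optimum_capacity = observed_usage / target_utilization
    return -abs(optimum_capacity - predicted_capacity)


# Predicting 50 units when about 100 would hit the 80% target.
print(utilization_reward(predicted_capacity=50.0, observed_usage=80.0))
```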
The problem is that after a few iterations the model predicts values below the low bound of my gym.spaces.Box, which raises an error and stops training.
Here is the error I am getting:
ValueError: ('Observation ({}) outside given space ({})!', array([ 1.89, 24.12]), Box(1.0, 21186.44, (2,), float64))
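The message refers to the observation rather than the action, and both printed values lie between 1.0 and 21186.44. One possible reading is that the Box was built with separate bounds for each dimension and the printed form shows only a summary of them. The sketch below shows a per-dimension check and clamp that could be applied to an observation before the environment returns it. It uses plain Python lists so it runs without gym installed. The bounds are made up for illustration, and the helper names are hypothetical.

```python
def find_violations(obs, low, high):
    """Return (index, value, low, high) for each component outside its bounds."""
    return [
        (i, x, lo, hi)
        for i, (x, lo, hi) in enumerate(zip(obs, low, high))
        if not (lo <= x <= hi)
    ]


def clip_to_bounds(obs, low, high):
    """Clamp each component into [low, high], component by component."""
    return [min(max(x, lo), hi) for x, lo, hi in zip(obs, low, high)]


# Hypothetical per-dimension bounds. The real ones are whatever arrays were
# passed as low/high when the observation Box was created.
low = [1.0, 100.0]
high = [64.0, 21186.44]
obs = [1.89, 24.12]  # the observation from the error message

violations = find_violations(obs, low, high)
clipped = clip_to_bounds(obs, low, high)
print(violations)  # components that fall outside their own bounds
print(clipped)     # observation clamped into the bounds
```

With NumPy arrays, np.clip(obs, low, high) performs the same clamping. Clamping hides the symptom, so if legitimate observations can fall below the declared low, widening the Box bounds may be the better fix.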