Ray autoscaling despite hard limit on number of replicas

I’m not understanding why Ray is trying to upscale when I’ve explicitly set my deployment to have a max replicas of 1. Can someone explain this to me? I’m running this locally.

Here’s the deployment:

@serve.deployment(
    max_queued_requests=100,
    ray_actor_options={"num_gpus": 1.0, "num_cpus": 2.0},
    autoscaling_config=AutoscalingConfig(
        min_replicas=0,
        max_replicas=1,
        idle_timeout_minutes=5,
        upscale_delay_s=1.0,
    ),
    logging_config=LoggingConfig(log_level="ERROR"),
)

And the warning:
- Deployment 'Predictor' in application 'app1' has 9 replicas that have taken more than 30s to be scheduled.

Hi @Josiah_Reeves, can you paste the controller logs? Can you also paste the serve config? You can get the serve config by running curl http://localhost:8265/api/serve/applications/ if you’re running it locally, or if not locally then the app config from the Ray Dashboard would also help!