I'm new to Ray, so this might be a stupid question. I'm using Ray 1.2.0.
I am trying to create a sklearn model from the example and deploy it on a GCP Ray cluster.
The GCP cluster was created using the default configuration.
Here is a simple machine learning model using sklearn (taken from the Ray example):
from ray import serve
import ray
import pickle
import requests
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

# Train model
iris_dataset = load_iris()
model = GradientBoostingClassifier()
model.fit(iris_dataset["data"], iris_dataset["target"])


class BoostingModel:
    def __init__(self):
        self.model = model
        self.label_list = iris_dataset["target_names"].tolist()

    async def __call__(self, starlette_request):
        payload = await starlette_request.json()
        print("Worker: received starlette request with data", payload)
        input_vector = [
            payload["sepal length"],
            payload["sepal width"],
            payload["petal length"],
            payload["petal width"],
        ]
        prediction = self.model.predict([input_vector])[0]
        human_name = self.label_list[prediction]
        return {"result": human_name}


if __name__ == '__main__':
    ray.init(address='auto', _redis_password='5241590000000000')
    # # listen on to make the HTTP server accessible from other machines.
    client = serve.start()
    client.create_backend("lr:v1", BoostingModel, config=serve.BackendConfig(num_replicas=2))
    client.create_endpoint("iris_classifier", backend="lr:v1", route="/regressor")
The code works fine locally. I then submit it to the live GCP Ray cluster using
ray submit [gcp yaml]
Then I get the following error:
ray.serve.exceptions.RayServeException: Cannot scale backend to 1 replicas. Ray Serve tried to add 1 replicas but the resources only allows 0 to be added. To fix this, consider scaling to replica to 0 or add more resources to the cluster.
I've played around with a couple of settings and still can't figure it out.
For example, if I set num_replicas=0 in BackendConfig, the validator complains:
ensure this value is greater than 0 (type=value_error.number.not_gt; limit_value=0)
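To see what the cluster actually has free when the error is raised, something like the following could be run from the script. It is only a minimal sketch: `replicas_that_fit` and `report_cluster_capacity` are hypothetical helpers written for illustration, not part of Ray or Ray Serve, and the figure of 1 CPU per replica is an assumption about the default rather than something confirmed here. The only Ray calls used are `ray.cluster_resources()` and `ray.available_resources()`.

```python
import math


def replicas_that_fit(available, per_replica):
    """Return how many replicas fit into `available` given `per_replica` needs.

    Both arguments are dicts such as {"CPU": 2.0}. Hypothetical helper for
    illustration only; this is not how Ray Serve itself is implemented.
    """
    counts = []
    for name, needed in per_replica.items():
        if needed <= 0:
            continue  # a zero requirement never limits the count
        counts.append(math.floor(available.get(name, 0.0) / needed))
    # With no positive requirement, resources impose no limit at all.
    return min(counts) if counts else math.inf


def report_cluster_capacity(per_replica=None):
    """Print total and free resources of the cluster the driver is connected to.

    Must be called after ray.init(...) has connected to the cluster.
    """
    import ray  # imported lazily so replicas_that_fit works without Ray

    per_replica = per_replica or {"CPU": 1}  # assumed per-replica requirement
    free = ray.available_resources()
    print("total:", ray.cluster_resources())
    print("free: ", free)
    print("replicas that would fit:", replicas_that_fit(free, per_replica))
```

Calling `report_cluster_capacity()` right after `ray.init(address='auto', ...)` and again just before `create_backend` should show whether any CPU is left for the replicas at that point.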
Can anyone tell me whether this is the right approach to deploying a machine learning Serve application on an existing cloud cluster?