How severe does this issue affect your experience of using Ray?
- High: It blocks me to complete my task.
Hi everyone,
I am trying to serve a very simple MNIST model using Ray Serve on a Google Cloud machine.
Workflow
I have a Python file, ray_serve.py, with the MNIST deployment logic. The file is more or less as follows:
import numpy as np
import torch
from ray import serve


@serve.deployment()
class MNISTServer:
    def __init__(self):
        self.model = ...  # code to fetch the model

    async def __call__(self, request):
        # Parse the JSON body and convert the nested list to a tensor.
        input_array = (await request.json())['array']
        input_array = torch.tensor(input_array)
        with torch.no_grad():
            preds = self.model(input_array).cpu().numpy()
        class_ = np.argmax(preds, axis=1)
        return {"class": class_}


mnist_app = MNISTServer.bind()
Then, in the command line, I do the following:
ray start --head
serve build ray_serve:mnist_app -o serve_config.yaml
Which yields the YAML file:
# This file was generated using the `serve build` command on Ray v2.6.3.

proxy_location: EveryNode

http_options:
  host: 0.0.0.0
  port: 8000

applications:
- name: app1
  route_prefix: /
  import_path: ray_serve:mnist_app
  runtime_env: {}
  deployments:
  - name: MNISTServer
Then I do:
serve deploy serve_config.yaml
2023-08-24 15:37:22,369 SUCC scripts.py:226 --
Sent deploy request successfully.
* Use `serve status` to check applications' statuses.
* Use `serve config` to see the current application config(s)
The issue is that, unlike what is written in the documentation:
`host` and `port` are HTTP options that determine the host IP address and the port for your Serve application’s HTTP proxies. These are optional settings and can be omitted. By default, the `host` will be set to `0.0.0.0` to expose your deployments publicly, and the `port` will be set to `8000`. If you’re using Kubernetes, setting `host` to `0.0.0.0` is necessary to expose your deployments outside the cluster.
this configuration is not exposing the deployment publicly. When I run the following on the VM itself:
>>> from PIL import Image
>>> import numpy as np
>>> import requests
>>> im = Image.open('8222.png')
>>> array = (np.asarray(im)/255).reshape((1, 28, 28)).tolist()
>>> print(requests.post('http://127.0.0.1:8000', json={'array': array}).text)
{"class": [4]}
everything works fine. But if I send the same request from my laptop to the VM's public IP, I get this error:
>>> print(requests.post('http://<public ip>:8000/').text)
TimeoutError: [Errno 110] Connection timed out
I made sure to allow HTTP traffic on the VM. Since this is my first deployment, I am not sure what I could be missing.
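
To separate a Serve problem from a network problem, a plain TCP reachability check might help. Below is a minimal sketch using only the Python standard library. The helper name `port_is_open` and the 3-second timeout are illustrative choices, not part of Ray, and the host in the usage section is a placeholder to be swapped for the VM's public IP when run from the laptop.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Placeholder host: replace with the VM's public IP when testing remotely.
    host = "127.0.0.1"
    port = 8000
    print(f"{host}:{port} reachable: {port_is_open(host, port)}")
```

If this reports the port as reachable on the VM but not from the laptop, the request is probably being dropped before it ever reaches Serve, for example by a firewall rule that does not cover port 8000. That is an assumption to verify, not a confirmed diagnosis.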