Expose deployments

augusto-peres · August 24, 2023, 3:53pm

How severe does this issue affect your experience of using Ray?

High: It blocks me to complete my task.

Hi everyone,

I am trying to serve a very simple MNIST model using Ray Serve on a Google Cloud machine.

Workflow

I have a python file with the MNIST deployment logic. The file is more or less as follows:

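import numpy as np
import torch
from ray import serve


@serve.deployment()
class MNISTServer:

    def __init__(self):
        self.model = ...  # code to fetch the model (omitted in this post)

    async def __call__(self, request):
        # The request body is JSON of the form {"array": [...]}
        input_array = (await request.json())['array']
        input_array = torch.tensor(input_array)

        # Inference only, so gradients are not needed
        with torch.no_grad():
            preds = self.model(input_array).cpu().numpy()

        # Index of the highest score for each input in the batch
        class_ = np.argmax(preds, axis=1)
        return {"class": class_}


mnist_app = MNISTServer.bind()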

Then, in the command line, I do the following:

ray start --head
serve build ray_serve:mnist_app -o serve_config.yaml

Which yields the yaml file:

# This file was generated using the `serve build` command on Ray v2.6.3.

proxy_location: EveryNode

http_options:
  host: 0.0.0.0
  port: 8000

applications:
- name: app1
  route_prefix: /
  import_path: ray_serve:mnist_app
  runtime_env: {}
  deployments:
  - name: MNISTServer

Then I do:

serve deploy serve_config.yaml 
2023-08-24 15:37:22,369 SUCC scripts.py:226 -- 
Sent deploy request successfully.
 * Use `serve status` to check applications' statuses.
 * Use `serve config` to see the current application config(s)

The issue is that, unlike what is written in the documentation:

host and port are HTTP options that determine the host IP address and the port for your Serve application’s HTTP proxies. These are optional settings and can be omitted. By default, the host will be set to 0.0.0.0 to expose your deployments publicly, and the port will be set to 8000. If you’re using Kubernetes, setting host to 0.0.0.0 is necessary to expose your deployments outside the cluster.

This is not exposing the deployment publicly. When I run the following on the VM itself:

>>> from PIL import Image
>>> import numpy as np
>>> import requests
>>> im = Image.open('8222.png')
>>> array = (np.asarray(im)/255).reshape((1, 28, 28)).tolist()
>>> print(requests.post('http://127.0.0.1:8000', json={'array': array}).text)
{"class": [4]}

Everything works fine. But if I try the same thing from my laptop, sending the request to the VM's public IP, I get the error:

>>> print(requests.post('http://<public ip>:8000/').text)
TimeoutError: [Errno 110] Connection timed out

I made sure to allow HTTP traffic in the VM and, since this is my first deployment, I am not sure what I could be missing.

Jules_Damji · August 24, 2023, 4:45pm

@augusto-peres Does this work on your laptop, just curious?

asking the Serve team cc: @shrekris @Sihan_Wang

augusto-peres · August 24, 2023, 4:50pm

This works both on the VM and on my laptop when sending requests to localhost.

augusto-peres · August 28, 2023, 11:37am

I found out that port 8000 was not allowing HTTP traffic. HTTP traffic was only allowed on port 80.

Setting a firewall rule on port 8000 to allow http traffic solves the problem.
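
For anyone hitting the same symptom, here is a minimal sketch of how the port could be checked from the laptop before debugging Serve itself. It only uses the standard library `socket` module. The function name `probe_port` and the 3-second default timeout are my own choices for illustration, not anything from Ray. As a rough rule of thumb, a timeout (as in the error above) often means a firewall is silently dropping the packets, while "refused" usually means the machine was reached but nothing is listening on that port.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a plain TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    (Hypothetical helper for illustration; not part of Ray.)
    """
    try:
        # create_connection performs the TCP handshake and raises on failure
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on this port
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all within the timeout, e.g. packets dropped on the way
        return "timeout"
    except OSError as exc:
        # Anything else: DNS failure, unreachable network, and so on
        return f"error: {exc}"


# Example usage (replace the host with the VM's public IP):
#   print(probe_port("<public ip>", 8000))
#   print(probe_port("<public ip>", 80))
```

Comparing the result for port 8000 with the result for port 80 from outside the VM should show whether the two ports are being treated differently.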
