_local_testing_mode in serve.run

ophiryaniv-ts · February 9, 2025, 8:46pm

How severe does this issue affect your experience of using Ray?

Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hello,
I’m starting to use ray.serve and want to test my deployments using _local_testing_mode=True. However, when I enable this mode, the server does not seem to start, and I get a Connection refused error when making a request.

Example:

  import requests
  from starlette.requests import Request
  from ray import serve
  from ray.serve.handle import DeploymentHandle, DeploymentResponse

  @serve.deployment
  class Doubler:
      def double(self, s: str):
          return s + " " + s
  
  @serve.deployment
  class HelloDeployment:
      def __init__(self, doubler: DeploymentHandle):
          self.doubler = doubler
  
      async def say_hello_twice(self, name: str):
          return await self.doubler.double.remote(f"Hello, {name}!")
  
      async def __call__(self, request: Request):
          return await self.say_hello_twice(request.query_params["name"])
  
  app = HelloDeployment.bind(Doubler.bind())

Using the standard mode (without _local_testing_mode), the following works correctly:

  serve.run(app)
  res = requests.get("http://127.0.0.1:8000", params={"name": "Ray"})
  print(res.text)

I get the following output:

2025-02-09 22:41:14,266	INFO worker.py:1832 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
(ProxyActor pid=64461) INFO 2025-02-09 22:41:16,992 proxy 127.0.0.1 -- Proxy starting on node d5b3d23cd64c84d7b2415e11a0fd8d7437a5a119d9d6eda190d26164 (HTTP port: 8000).
(ProxyActor pid=64461) INFO 2025-02-09 22:41:17,036 proxy 127.0.0.1 -- Got updated endpoints: {}.
INFO 2025-02-09 22:41:17,041 serve 64441 -- Started Serve in namespace "serve".
(ServeController pid=64460) INFO 2025-02-09 22:41:17,140 controller 64460 -- Deploying new version of Deployment(name='Doubler', app='default') (initial target replicas: 1).
(ServeController pid=64460) INFO 2025-02-09 22:41:17,141 controller 64460 -- Deploying new version of Deployment(name='HelloDeployment', app='default') (initial target replicas: 1).
(ProxyActor pid=64461) INFO 2025-02-09 22:41:17,142 proxy 127.0.0.1 -- Got updated endpoints: {Deployment(name='HelloDeployment', app='default'): EndpointInfo(route='/', app_is_cross_language=False)}.
(ServeController pid=64460) INFO 2025-02-09 22:41:17,243 controller 64460 -- Adding 1 replica to Deployment(name='Doubler', app='default').
(ServeController pid=64460) INFO 2025-02-09 22:41:17,244 controller 64460 -- Adding 1 replica to Deployment(name='HelloDeployment', app='default').
INFO 2025-02-09 22:41:18,163 serve 64441 -- Application 'default' is ready at http://127.0.0.1:8000/.
INFO 2025-02-09 22:41:18,163 serve 64441 -- Deployed app 'default' successfully.
Hello, Ray! Hello, Ray!
(ServeReplica:default:Doubler pid=64465) /Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/ray/serve/_private/replica.py:1200: UserWarning: Calling sync method 'double' directly on the asyncio loop. In a future version, sync methods will be run in a threadpool by default. Ensure your sync methods are thread safe or keep the existing behavior by making them `async def`. Opt into the new behavior by setting RAY_SERVE_RUN_SYNC_IN_THREADPOOL=1.
(ServeReplica:default:Doubler pid=64465)   warnings.warn(
(ServeReplica:default:Doubler pid=64465) INFO 2025-02-09 22:41:18,193 default_Doubler r121jmrt 946c8101-c44e-4d39-9a98-17014da4d188 -- CALL / OK 2.4ms
(ServeReplica:default:HelloDeployment pid=64456) INFO 2025-02-09 22:41:18,194 default_HelloDeployment m6rge5c7 946c8101-c44e-4d39-9a98-17014da4d188 -- GET / 200 16.2ms

This confirms that the deployment is running, and the request is successfully handled.

Issue with _local_testing_mode=True:
When I run:

  serve.run(app, _local_testing_mode=True)
  res = requests.get("http://127.0.0.1:8000", params={"name": "Ray"})
  print(res.text)

I get the following error:

INFO 2025-02-09 22:28:54,134 local_test - -- Initializing local replica class for Deployment(name='Doubler', app='default').
INFO 2025-02-09 22:28:54,134 local_test - -- Initializing local replica class for Deployment(name='HelloDeployment', app='default').
INFO 2025-02-09 22:28:54,135 local_test - -- Deployed app 'default' successfully.
Traceback (most recent call last):
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 415, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connection.py", line 244, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1276, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1322, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1271, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1031, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 969, in send
    self.connect()
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connection.py", line 205, in connect
    conn = self._new_conn()
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x1630ad690>: Failed to establish a new connection: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 798, in urlopen
    retries = retries.increment(
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=8000): Max retries exceeded with url: /?name=Ray (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1630ad690>: Failed to establish a new connection: [Errno 61] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/Projects/ray_demo/app/test_ray.py", line 30, in <module>
    res = requests.get("http://127.0.0.1:8000", params={"name": "Ray"})
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8000): Max retries exceeded with url: /?name=Ray (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1630ad690>: Failed to establish a new connection: [Errno 61] Connection refused'))

Process finished with exit code 1

It seems like _local_testing_mode does not start an HTTP server.

Thanks

christina · February 12, 2025, 1:29am

Hi there! Welcome to the Ray community~

You’re right, when _local_testing_mode=True, Ray Serve doesn’t actually start an HTTP server. Instead, it runs deployments within a single process using background threads, which is useful for unit testing but not for testing HTTP endpoints. That’s why you’re getting a “Connection refused” error when trying to send requests to 127.0.0.1:8000.
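To confirm that nothing is listening before sending a request, a quick TCP probe can help. The following is only a sketch using the Python standard library; `is_port_open` is a hypothetical helper name, not part of Ray, and the host and port are taken from the example in this thread.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value on failure
        # (for example ECONNREFUSED when no server is bound to the port).
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Address assumed from the thread: Serve's default HTTP proxy location.
    print(is_port_open("127.0.0.1", 8000))
```

If this reports that the port is closed right after serve.run(app, _local_testing_mode=True), that matches the behavior described above: no HTTP proxy has been started.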

If your goal is to test the logic of your deployment without starting the full Ray infrastructure, you can call the deployment handle directly in Python. For example:

handle: DeploymentHandle = serve.run(app, _local_testing_mode=True)
response: DeploymentResponse = handle.say_hello_twice.remote(name="Ray")
print(response.result())

This way, you can verify the deployment’s behavior without relying on an HTTP request. But if you actually want to test the HTTP endpoint, you’ll need to run serve.run(app) without _local_testing_mode.

Since this mode is still experimental, if you find it limiting for your use case, you might want to open an issue or feature request on Ray’s GitHub.

Here are some relevant docs in case you’d like to do more reading:

Docs

ophiryaniv-ts · February 12, 2025, 1:07pm

Hi @christina,

Thanks for the clarification! That makes a lot more sense now.

I believe I can skip the HTTP in my case, but I’ll give it a try and update you if needed.

Really appreciate the great project and your quick response!

christina · February 12, 2025, 8:04pm

No worries!! Let me know if you have any other issues and feel free to message me anytime.

ophiryaniv-ts · March 11, 2025, 12:39pm

Hey @christina, I have a follow-up question.

I’m using a config.yaml and an app_builder to run my app.

When I enable the _local_testing_mode flag for testing, it seems like the deployments are not being configured with my user_config parameters.

For example, if my deployment’s user_config includes a model_weights_path (which should be loaded into the model), it doesn’t seem to get applied when running in _local_testing_mode.

Any insights on this? Thanks!

ophiryaniv-ts · April 6, 2025, 9:32am

@christina
I’m adding a simple example — hope this makes things clearer:

from typing import Any
from ray import serve

@serve.deployment(user_config={"name": "Ophir"})
class MostBasicIngress:
    def __init__(self) -> None:
        self.user_config = {"name": "Corey"}

    async def reconfigure(self, user_config: dict[str, Any]) -> None:
        print("Reconfiguring...")
        self.user_config = user_config

    async def __call__(self) -> str:
        name = self.user_config["name"]
        return f"Hello {name}!"


app = MostBasicIngress.bind()
handle = serve.run(app, _local_testing_mode=True)
print(handle.remote().result())

This results in:

INFO 2025-04-06 12:26:39,058 serve 91665 -- Initializing local replica class for Deployment(name='MostBasicIngress', app='default').
Hello Corey!

christina · April 8, 2025, 7:58pm

Hi @ophiryaniv-ts, let me try running this locally and see if there’s anything I can do!

christina · April 8, 2025, 8:04pm

Just to be clear, you do not run into this issue when you turn _local_testing_mode off? This only happens in local testing mode?

ophiryaniv-ts · April 9, 2025, 7:01am

@christina thanks for the reply.
Yes, only when _local_testing_mode is on.

christina · April 11, 2025, 10:38pm

Hi! I was doing some research and it looks like this was indeed a bug that was resolved a few days ago.

PR: support user config for local testing mode by zcin · Pull Request #52052 · ray-project/ray · GitHub

This PR “Adds support for user_config in local testing mode.”

The fix will probably ship in the next release of Ray, but in the meantime feel free to take a look at the PR if you want to try to implement it locally in your app. Thank you for bringing it to our attention.
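Until a release with that fix is available, one possible workaround for unit tests is to exercise the class without Serve and call reconfigure yourself, mirroring the documented order in which Serve calls __init__ and then reconfigure on a replica. This is only a sketch: it assumes the class can be defined or imported without the @serve.deployment decorator, and `build_configured` is a hypothetical helper, not a Ray API.

```python
import asyncio
from typing import Any


class MostBasicIngress:
    """Same logic as the deployment above, without @serve.deployment."""

    def __init__(self) -> None:
        self.user_config: dict[str, Any] = {"name": "Corey"}

    async def reconfigure(self, user_config: dict[str, Any]) -> None:
        self.user_config = user_config

    async def __call__(self) -> str:
        name = self.user_config["name"]
        return f"Hello {name}!"


async def build_configured(user_config: dict[str, Any]) -> MostBasicIngress:
    # Hypothetical helper: construct the object, then apply user_config,
    # in the same order Serve uses when it starts a replica.
    instance = MostBasicIngress()
    await instance.reconfigure(user_config)
    return instance


async def main() -> str:
    ingress = await build_configured({"name": "Ophir"})
    return await ingress()


greeting = asyncio.run(main())
print(greeting)  # Hello Ophir!
```

This only checks the deployment's own logic. It does not exercise Serve's handle routing or the config.yaml parsing, so those still need a run without _local_testing_mode.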
