_local_testing_mode in serve.run

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hello,
I’m starting to use ray.serve and want to test my deployments using _local_testing_mode=True. However, when I enable this mode, the server does not seem to start, and I get a Connection refused error when making a request.

Example:

  import requests
  from starlette.requests import Request
  from ray import serve
  from ray.serve.handle import DeploymentHandle, DeploymentResponse

  @serve.deployment
  class Doubler:
      def double(self, s: str):
          return s + " " + s
  
  @serve.deployment
  class HelloDeployment:
      def __init__(self, doubler: DeploymentHandle):
          self.doubler = doubler
  
      async def say_hello_twice(self, name: str):
          return await self.doubler.double.remote(f"Hello, {name}!")
  
      async def __call__(self, request: Request):
          return await self.say_hello_twice(request.query_params["name"])
  
  app = HelloDeployment.bind(Doubler.bind())

Using the standard mode (without _local_testing_mode), the following works correctly:

  serve.run(app)
  res = requests.get("http://127.0.0.1:8000", params={"name": "Ray"})
  print(res.text)

I get the following output:

2025-02-09 22:41:14,266	INFO worker.py:1832 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
(ProxyActor pid=64461) INFO 2025-02-09 22:41:16,992 proxy 127.0.0.1 -- Proxy starting on node d5b3d23cd64c84d7b2415e11a0fd8d7437a5a119d9d6eda190d26164 (HTTP port: 8000).
(ProxyActor pid=64461) INFO 2025-02-09 22:41:17,036 proxy 127.0.0.1 -- Got updated endpoints: {}.
INFO 2025-02-09 22:41:17,041 serve 64441 -- Started Serve in namespace "serve".
(ServeController pid=64460) INFO 2025-02-09 22:41:17,140 controller 64460 -- Deploying new version of Deployment(name='Doubler', app='default') (initial target replicas: 1).
(ServeController pid=64460) INFO 2025-02-09 22:41:17,141 controller 64460 -- Deploying new version of Deployment(name='HelloDeployment', app='default') (initial target replicas: 1).
(ProxyActor pid=64461) INFO 2025-02-09 22:41:17,142 proxy 127.0.0.1 -- Got updated endpoints: {Deployment(name='HelloDeployment', app='default'): EndpointInfo(route='/', app_is_cross_language=False)}.
(ServeController pid=64460) INFO 2025-02-09 22:41:17,243 controller 64460 -- Adding 1 replica to Deployment(name='Doubler', app='default').
(ServeController pid=64460) INFO 2025-02-09 22:41:17,244 controller 64460 -- Adding 1 replica to Deployment(name='HelloDeployment', app='default').
INFO 2025-02-09 22:41:18,163 serve 64441 -- Application 'default' is ready at http://127.0.0.1:8000/.
INFO 2025-02-09 22:41:18,163 serve 64441 -- Deployed app 'default' successfully.
Hello, Ray! Hello, Ray!
(ServeReplica:default:Doubler pid=64465) /Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/ray/serve/_private/replica.py:1200: UserWarning: Calling sync method 'double' directly on the asyncio loop. In a future version, sync methods will be run in a threadpool by default. Ensure your sync methods are thread safe or keep the existing behavior by making them `async def`. Opt into the new behavior by setting RAY_SERVE_RUN_SYNC_IN_THREADPOOL=1.
(ServeReplica:default:Doubler pid=64465)   warnings.warn(
(ServeReplica:default:Doubler pid=64465) INFO 2025-02-09 22:41:18,193 default_Doubler r121jmrt 946c8101-c44e-4d39-9a98-17014da4d188 -- CALL / OK 2.4ms
(ServeReplica:default:HelloDeployment pid=64456) INFO 2025-02-09 22:41:18,194 default_HelloDeployment m6rge5c7 946c8101-c44e-4d39-9a98-17014da4d188 -- GET / 200 16.2ms

This confirms that the deployment is running and that the request is handled successfully.

Issue with _local_testing_mode=True:
When I run:

  serve.run(app, _local_testing_mode=True)
  res = requests.get("http://127.0.0.1:8000", params={"name": "Ray"})
  print(res.text)

I get the following error:

INFO 2025-02-09 22:28:54,134 local_test - -- Initializing local replica class for Deployment(name='Doubler', app='default').
INFO 2025-02-09 22:28:54,134 local_test - -- Initializing local replica class for Deployment(name='HelloDeployment', app='default').
INFO 2025-02-09 22:28:54,135 local_test - -- Deployed app 'default' successfully.
Traceback (most recent call last):
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 415, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connection.py", line 244, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1276, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1322, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1271, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1031, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 969, in send
    self.connect()
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connection.py", line 205, in connect
    conn = self._new_conn()
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x1630ad690>: Failed to establish a new connection: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 798, in urlopen
    retries = retries.increment(
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=8000): Max retries exceeded with url: /?name=Ray (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1630ad690>: Failed to establish a new connection: [Errno 61] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/Projects/ray_demo/app/test_ray.py", line 30, in <module>
    res = requests.get("http://127.0.0.1:8000", params={"name": "Ray"})
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/Users/user/Projects/ray_demo/venv/lib/python3.10/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8000): Max retries exceeded with url: /?name=Ray (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1630ad690>: Failed to establish a new connection: [Errno 61] Connection refused'))

Process finished with exit code 1

It seems like _local_testing_mode does not start an HTTP server.

Thanks :slight_smile:

Hi there! Welcome to the Ray community~

You’re right, when _local_testing_mode=True, Ray Serve doesn’t actually start an HTTP server. Instead, it runs deployments within a single process using background threads, which is useful for unit testing but not for testing HTTP endpoints. That’s why you’re getting a “Connection refused” error when trying to send requests to 127.0.0.1:8000.
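
If you want to confirm that independently of Ray, you can check whether anything is accepting TCP connections on the port before sending the request. Here is a minimal sketch using only the Python standard library; `is_port_open` is just an illustrative helper name I made up, not part of Ray or requests:

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises an OSError subclass (for example
        # ConnectionRefusedError) when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumes the default Serve HTTP address used in this thread.
    print(is_port_open("127.0.0.1", 8000))
```

Calling this right after serve.run(app, _local_testing_mode=True) should report that the port is closed, which matches the "Connection refused" traceback above.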

If your goal is to test the logic of your deployment without starting the full Ray infrastructure, you can call the deployment handle directly in Python. For example:

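  # Run in local testing mode and call the deployment through its handle instead of HTTP.
  handle: DeploymentHandle = serve.run(app, _local_testing_mode=True)
  response: DeploymentResponse = handle.say_hello_twice.remote(name="Ray")
  print(response.result())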

This way, you can verify the deployment’s behavior without relying on an HTTP request. But if you actually want to test the HTTP endpoint, you’ll need to run serve.run(app) without _local_testing_mode.

Since this mode is still experimental, if you find it limiting for your use case, you might want to open an issue or feature request on Ray’s GitHub. :slight_smile:

Here are some relevant docs in case you’d like to do more reading:

Docs

Hi @christina,

Thanks for the clarification! That makes a lot more sense now. :slight_smile:

I believe I can skip the HTTP in my case, but I’ll give it a try and update you if needed.

Really appreciate the great project and your quick response!

No worries!! Let me know if you have any other issues and feel free to message me anytime.

Hey @christina, I have a follow-up question.

I’m using a config.yaml and an app_builder to run my app.

When I enable the _local_testing_mode flag for testing, it seems like the deployments are not being configured with my user_config parameters.

For example, if my deployment’s user_config includes a model_weights_path (which should be loaded into the model), it doesn’t seem to get applied when running in _local_testing_mode.

Any insights on this? Thanks!