Memory Leak in Ray Serve 2.2.0

What is the issue?

  • Simple Ray Serve app with one deployment leaks memory.

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

We observed that a simple application with a single Ray Serve deployment leaks memory.
OS: Linux
Ray version: Ray 2.2.0

How to reproduce

Start the shell script start.sh:

#!/bin/bash
source activate ray-env   # environment needs only ray[serve]==2.2.0

ray start --temp-dir="./logs" --head --num-cpus 4 --num-gpus 0 --metrics-export-port=8103 --include-dashboard=false

python app.py # code for app.py below

nginx  # nginx for re-routing 

Code for app.py. To minimize side effects, we disabled the dashboard and limited logging.

import logging

import ray
from ray import serve

logger = logging.getLogger("ray.serve")
logger.setLevel(logging.ERROR)


@serve.deployment(route_prefix="/test_deployment")
class Test_deployment:

    def __init__(self):
        logger.setLevel(logging.ERROR)

    async def __call__(self, request):
        # Return a constant JSON-serializable response for every request.
        return {"code": 200, "response": "Hello"}

if __name__ == '__main__':
    ray.init(address="auto", logging_level=logging.ERROR, include_dashboard=False)
    serve.start(http_options={'port': 8102}, logging_level=logging.ERROR, detached=True)
    
    Test_deployment.deploy()

We have a script that continuously sends requests to the app (synchronously, so there is no queueing).
After 1 hour the app's memory usage had increased by 50 MB, as can be seen in the Prometheus screenshot.
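For reference, a minimal sketch of the request loop is below. The actual script is not included in this issue; the URL (built from the port 8102 and route_prefix in app.py), the request rate, and the use of the requests package are assumptions for illustration.

import time

import requests  # assumed HTTP client; any synchronous client works

URL = "http://localhost:8102/test_deployment"  # Serve HTTP port and route_prefix from app.py

while True:
    resp = requests.get(URL)        # wait for each response before sending the next request,
    assert resp.status_code == 200  # so requests never queue up inside Serve
    time.sleep(0.01)                # small pause; the exact rate is not important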

Can you please help us clarify whether this is the expected behaviour?
Thanks a lot!