Pure ray client management without init()

Hey team, I am looking into the ray.init function and trying to understand how to better integrate my app. It is a non-Ray app (a dag-worker) that needs to call Ray remote functions on a remote cluster over Ray's internal gRPC communication, so just a simple Ray Client integration. ray.init spins up a global client and then tries to manage everything for me, but this breaks down when I have 100+ workers that each call ray.init: I get a ton of random disconnects and other errors. Is there an easier way to get a ServeControllerClient without relying on all of the ray.init boilerplate? I want full control over client management: connecting, disconnecting, and running remote calls.

The workflow I expect is: given the deployment name, get the actor and make a remote call on it. Ideally I would get a handle (DeploymentHandle) and just call it like handle.func.remote(arg).
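For reference, the workflow described above can be sketched with the public API roughly as follows. This is only a sketch under assumptions: it still goes through ray.init (so it still creates the global client this post is trying to avoid), the app name "ml" and deployment name "LanguageDetection" are guesses based on the replica actor name that appears later in this post, and client_address and call_deployment are hypothetical helper names.

```python
from typing import Any


def client_address(host: str, port: int = 10001) -> str:
    """Build a Ray Client address such as ray://inference:10001."""
    return f"ray://{host}:{port}"


def call_deployment(address: str, app_name: str, deployment_name: str,
                    method: str, arg: Any) -> Any:
    """Connect, call one method on a Serve deployment, then disconnect."""
    # Imported lazily so client_address() is usable without Ray installed.
    import ray
    from ray import serve

    ray.init(address=address)  # NOTE: this still creates the global client
    try:
        handle = serve.get_deployment_handle(deployment_name, app_name=app_name)
        # Same as handle.<method>.remote(arg); .result() blocks for the reply.
        return getattr(handle, method).remote(arg).result()
    finally:
        ray.shutdown()
```

A call would then look like call_deployment(client_address("inference"), "ml", "LanguageDetection", "func", "some text"). Whether Serve handles behave well over Ray Client at this scale is exactly the open question here.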

Current status: I was able to create a client like this:

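    from ray.serve._private.constants import SERVE_NAMESPACE
    from ray.util.client import _ClientContext

    # Standalone client context, separate from the global one that ray.init creates.
    client = _ClientContext()
    client.connect(conn_str="inference:10001", namespace=SERVE_NAMESPACE)

    print(client.is_connected())
    print(client.api.is_initialized())
    print(client.api.list_named_actors())
    print(client.client_worker.get_actor("SERVE_REPLICA::ml#LanguageDetection#dgzaYn"))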

But I can't get the actor. The get_actor call fails with this error:

    self._worker.call_retain(id)
    AttributeError: 'NoneType' object has no attribute 'call_retain'
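Since only the deployment name is known up front, the replica actor name passed to get_actor would have to be looked up first. A minimal sketch of that lookup, assuming replica actors are always named SERVE_REPLICA::<app>#<deployment>#<suffix> (inferred from the single name above, not from any documented format); replica_actor_names is a hypothetical helper and it does not address the call_retain error itself:

```python
from typing import Iterable, List

# Assumed prefix, taken from the one actor name shown in this post.
REPLICA_PREFIX = "SERVE_REPLICA::"


def replica_actor_names(actor_names: Iterable[str], app: str,
                        deployment: str) -> List[str]:
    """Return the names that look like replicas of app/deployment."""
    wanted = f"{REPLICA_PREFIX}{app}#{deployment}#"
    return [name for name in actor_names if name.startswith(wanted)]
```

With the client above this would be used as replica_actor_names(client.api.list_named_actors(), "ml", "LanguageDetection").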

OK, I have gone in a completely different direction, since it looks like I am missing a lot of things with _ClientContext.


    import ray
    from ray._private.node import Node
    from ray._private.parameter import RayParams
    from ray._private.worker import Worker, connect
    from ray.serve._private.constants import SERVE_NAMESPACE

    worker = Worker()

    # In this case, we only need to connect the node.
    ray_params = RayParams(
        node_ip_address=None,
        gcs_address="10.224.3.52:6379",
        object_ref_seed=None,
        redis_address=None,
        redis_password=None,
        temp_dir="./ray",
        _system_config={},
        enable_object_reconstruction=False,
        metrics_export_port=None,
    )

    node = Node(
        ray_params,
        head=False,
        shutdown_at_exit=False,
        spawn_reaper=False,
        connect_only=True,
    )

    connect(
        node,
        node.session_name,
        mode=0,  # SCRIPT_MODE = 0
        log_to_driver=True,
        worker=worker,
        job_id=None,
        namespace=SERVE_NAMESPACE,
        job_config=ray.job_config.JobConfig(),
        entrypoint=ray._private.utils.get_entrypoint_name(),
    )

This gives me the following error:

    2024-05-02 16:19:49,938 INFO node.py:1001 -- Can't find a `node_ip_address.json` file from ray/session_2024-05-01_14-56-44_672413_8. Have you started Ray instsance using `ray start` or `ray.init`?

If I pass node_ip_address explicitly, like this:

    RayParams(
        node_ip_address="10.224.3.52",
        gcs_address="10.224.3.52:6379",

then I get:

    [2024-05-02 16:24:11,152 C 99963 4993918] raylet_client.cc:60: Could not connect to socket /tmp/ray/session_2024-05-01_14-56-44_672413_8/sockets/raylet
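The path in that last message is a local filesystem path, which suggests this code path expects a raylet running on the same machine as the worker. A quick way to check that assumption is to test whether the socket actually exists locally. This is only a diagnostic sketch; has_local_raylet_socket is a hypothetical helper, and the sockets/raylet layout is taken from the error message rather than from any documented contract:

```python
import os
import stat


def has_local_raylet_socket(session_dir: str) -> bool:
    """True if <session_dir>/sockets/raylet exists and is a Unix socket."""
    path = os.path.join(session_dir, "sockets", "raylet")
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path, missing parent directory, or no permission.
        return False
```

For example, has_local_raylet_socket("/tmp/ray/session_2024-05-01_14-56-44_672413_8") returning False on the dag-worker host would mean there is no local raylet to attach to there.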