RLlib's PolicyServer and external simulator as client

klausk55 · March 30, 2021, 2:25pm

Hello Ray community,

I use RLlib in combination with a custom external simulator. For this purpose, I use a PolicyServer on RLlib’s side and a client on the external simulator’s side (HTTP server/client).
My problem is that I cannot speed up the simulation any further (i.e. step the env and get an action faster), because each client-server round trip currently takes about 100-300 ms on average.
The time horizon in the env is several hours (or infinite), and each step advances simulated time by 1 s, so episodes can still take (too) long.

Any recommendations on this dilemma?

sven1977 · March 31, 2021, 8:04am

Can you run several simulators (clients) connecting to the same server?
Or several simulators using the same client (vectorized)?

klausk55 · March 31, 2021, 8:24am

Not yet. For now I can run only one instance of the simulator on a computer, but the idea is to later run one instance on each of several computers.

Correct me if I’m wrong, but several instances of the simulator don’t solve this client/server communication bottleneck. Is there any way to increase the throughput, e.g. by using raw sockets (TCP/IP) instead of HTTP?

sven1977 · March 31, 2021, 9:57am

Yeah, 100-300 ms seems like a lot.

Did you try using inference_mode=local to compute the action on the client side? That way, we only have to send data to the server for training, never for single action computations.
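As a rough illustration of the idea (not RLlib's implementation), the sketch below computes every action locally and only contacts the "server" when the cached weights are older than a refresh interval. The class `LocalInferencePolicy`, the callable `fetch_weights`, the injectable `clock`, and the toy linear policy are all hypothetical names invented for this example.

```python
import time


class LocalInferencePolicy:
    """Toy inference-only policy that refreshes its weights at most once
    every `update_interval` seconds (hypothetical, for illustration only)."""

    def __init__(self, fetch_weights, update_interval, clock=time.monotonic):
        self._fetch_weights = fetch_weights  # stand-in for a call to the server
        self._update_interval = update_interval
        self._clock = clock
        self._weights = fetch_weights()  # initial weight download
        self._last_update = clock()

    def compute_action(self, obs):
        now = self._clock()
        # Only talk to the server when the cached weights are stale.
        if now - self._last_update >= self._update_interval:
            self._weights = self._fetch_weights()
            self._last_update = now
        # Toy policy: the sign of a linear score picks a binary action.
        score = sum(w * x for w, x in zip(self._weights, obs))
        return 1 if score > 0 else 0


# Demo with a fake clock: 25 steps, one simulated second per step.
fetch_calls = []


def fetch_weights():
    fetch_calls.append(1)  # count how often the "server" is contacted
    return [1.0, -1.0]


fake_time = [0.0]
policy = LocalInferencePolicy(
    fetch_weights, update_interval=10.0, clock=lambda: fake_time[0])

for step in range(25):
    fake_time[0] = float(step)
    policy.compute_action([0.5, 0.2])
```

In this demo the server is contacted three times in total (once at start-up, then at simulated seconds 10 and 20) rather than once per step.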

sven1977 · March 31, 2021, 10:02am

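    from ray.rllib.env.policy_client import PolicyClient

    client = PolicyClient(
        "http://" + args.server,  # address of the running PolicyServer
        inference_mode="local",  # compute actions on the client side
        # Seconds between weight syncs from the server (10.0 is an example value).
        update_interval=10.0)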

klausk55 · March 31, 2021, 10:21am

Sorry, I forgot to mention that the client side (external simulator) runs outside Python. Otherwise I would certainly use the PolicyClient class with inference_mode='local', but the simulator/client runs in C#.

klausk55 · March 31, 2021, 10:40am

I’ve found out that a call of my custom NN model in forward_rnn seems to take more than 50% of the time of a complete client-server round trip. From your experience, would you say that this is a normal proportion? Also, this amount of time doesn’t really seem to scale with the number of params in the NN model.

sven1977 · March 31, 2021, 11:13am

Ah yes, sorry, I remember your use-case (no Python on the client side).
Hmm, I wouldn’t say it’s abnormal for a model to take a long time to do a computation. How many parameters does your model have? Also, torch or tf?

sven1977 · March 31, 2021, 11:15am

And yes, if you are not doing any batching for action computations (like parallelizing your env/simulator), this does seem like quite an inefficient setup. You are spending lots of time on a) sending a single observation over the wire (plus the slowness of HTTP) and b) doing a forward pass (on a possibly large model) on a batch of size 1.
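To make the batching point concrete, here is a toy sketch (plain Python, not RLlib code) that counts requests when eight observations are sent one at a time versus in a single batch. `ToyPolicyServer`, its `compute_actions` method, and the tiny linear "model" are hypothetical stand-ins; a real setup would batch across several simulators or vectorized sub-environments.

```python
import math


class ToyPolicyServer:
    """Hypothetical stand-in for a policy server; counts incoming requests."""

    def __init__(self, weights):
        self.weights = weights
        self.requests = 0

    def compute_actions(self, obs_batch):
        # One request (one round trip, one forward pass) serves the whole batch.
        self.requests += 1
        return [
            math.tanh(sum(w * x for w, x in zip(self.weights, obs)))
            for obs in obs_batch
        ]


# Eight observations, e.g. from eight simulator instances.
observations = [[0.1 * i, -0.2 * i, 0.3] for i in range(8)]

# Unbatched: every observation pays the full round-trip overhead.
single = ToyPolicyServer([0.5, -1.0, 2.0])
actions_single = [single.compute_actions([obs])[0] for obs in observations]

# Batched: the fixed per-request overhead is paid once for all eight.
batched = ToyPolicyServer([0.5, -1.0, 2.0])
actions_batched = batched.compute_actions(observations)
```

Both variants produce the same actions; the unbatched server handles eight requests and the batched one handles a single request, so the fixed per-request cost is amortized over the batch.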

sven1977 · March 31, 2021, 11:18am

Would it be difficult to write an inference-only C# policy so you could do local inference with it and from time to time update its weights that are coming from the server?
I think this would speed up everything considerably.
Sure, we could also provide a faster client/server protocol for RLlib, maybe using msgpack w/ tcp. I’ll add this to our list of improvements (not sure whether this would make it for Q2, though).
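For readers wondering what a leaner protocol over TCP could look like, here is a minimal sketch of length-prefixed message framing over a socket. It is not an RLlib protocol: JSON from the standard library stands in for msgpack, and the message fields (`command`, `observation`, `action`) and helper names are made up for illustration.

```python
import json
import socket
import struct


def send_msg(sock, payload):
    """Send one message: 4-byte big-endian length, then the JSON body."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def _recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one length-prefixed message and decode its JSON body."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


# Demo over a connected socket pair (stands in for a real TCP connection).
client, server = socket.socketpair()
try:
    send_msg(client, {"command": "GET_ACTION", "observation": [0.1, 0.2, 0.3]})
    request = recv_msg(server)
    send_msg(server, {"action": 1})
    reply = recv_msg(client)
finally:
    client.close()
    server.close()
```

A persistent connection with this kind of framing avoids per-request HTTP overhead, but it does not remove the cost of a forward pass on a batch of size 1.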

klausk55 · March 31, 2021, 12:58pm

Yes, that’s me
TF model which is mostly shared between two policies atm. There are two transportation agents and thus at most two NN model calls per step. My custom model has four “entrance branches” with dense layers, pooling layer, concatenation, dense layer, LSTM and dense layer. A small test config has about 295k params, the “original” config has about 2.5m params (see screenshots).

rusu24edward · March 31, 2021, 9:04pm

@klausk55 have you looked into python bindings for your C# sim?

klausk55 · April 1, 2021, 7:50am

@rusu24edward Do you mean something like Python.NET or IronPython?
If so, do you have any experience with such “bindings”? I don’t.
Additionally, I have concerns about compatibility with RLlib (i.e. can I still use RLlib?).

rusu24edward · April 5, 2021, 5:23pm

I don’t have experience with C# bindings, but I’ve experimented with C++ bindings for a simulation with RLlib, and I was able to train.

rusu24edward · April 12, 2021, 6:37pm

I accidentally moved this conversation to a personal message between klausk and me. Here are some important follow-ups for anyone reading this convo:

me: Here’s a small demo of what I did before: rusu24edward/pybind11-demo on GitHub, which demonstrates how to call a C++ class from Python using pybind11. Basically, I recreated the simple corridor example in C++ and used pybind11 to create Python bindings. I then created a driver script that trained a policy using RLlib. It was all very seamless once I got the connection between C++ and Python.

klausk: I’ve experimented with Python.NET, which allows embedding Python in C#, and it works really well so far!
So far, I initialize and call a PolicyClient from C# using inference_mode="local", which saves me from doing a server-client round trip each step. This is much faster for generating rollouts!
It should probably also be possible to skip PolicyServer/PolicyClient entirely and use the ExternalEnv or BaseEnv class directly, but at this stage of prototyping I’m really happy with the current workaround.
Thanks a lot for your tip, it may turn out to be a milestone for further progress!

rusu24edward · April 12, 2021, 6:39pm

@sven1977, do you think it would be beneficial to include this guidance in the rllib docs for external environments? We can use and format the demo I linked to for the C++/Python bindings.
