We have a preliminary, fairly general ROS 2 suite that runs RLlib decentralized across multiple robots. The main idea is that each agent/robot maintains a dynamic list of neighbors it subscribes to, selected by arbitrary criteria (e.g. distance). This keeps per-robot communication bounded, so the design scales to tens (or hundreds?) of physical agents. The core idea is putting ROS 2 in charge: as sensor and neighbor data comes in, it is compiled into observations on the fly, which are sent to RLlib to determine actions.
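To make the neighbor-management piece concrete, here's a stripped-down sketch. Everything here is illustrative rather than our actual code: the topic names, the `PoseStamped` message types, and the `NEIGHBOR_RADIUS` distance criterion are all placeholders. The gist is that every robot listens to everyone's low-rate pose topic, but only opens the heavier subscriptions for robots currently in range:

```python
import math
from functools import partial

import rclpy
from rclpy.node import Node
from geometry_msgs.msg import PoseStamped

NEIGHBOR_RADIUS = 5.0  # meters; stand-in for an arbitrary criterion


class NeighborManager(Node):
    """Keeps high-rate subscriptions only for robots currently in range."""

    def __init__(self, my_id: str, robot_ids: list):
        super().__init__(f"neighbor_manager_{my_id}")
        self.my_pose = None
        self.peer_poses = {}   # robot_id -> latest PoseStamped
        self.state_subs = {}   # robot_id -> active high-rate subscription
        self.create_subscription(
            PoseStamped, f"/{my_id}/pose", self._on_my_pose, 10)
        # Low-rate pose subscriptions to everyone, used only for gating.
        for rid in (r for r in robot_ids if r != my_id):
            self.create_subscription(
                PoseStamped, f"/{rid}/pose",
                partial(self._on_peer_pose, rid), 1)
        # Periodically re-evaluate the neighbor set.
        self.create_timer(1.0, self._update_neighbors)

    def _on_my_pose(self, msg):
        self.my_pose = msg

    def _on_peer_pose(self, rid, msg):
        self.peer_poses[rid] = msg

    def _on_peer_state(self, rid, msg):
        # Feed this neighbor's data into the observation being assembled.
        pass

    def _update_neighbors(self):
        if self.my_pose is None:
            return
        me = self.my_pose.pose.position
        for rid, pose in self.peer_poses.items():
            p = pose.pose.position
            in_range = math.dist((me.x, me.y), (p.x, p.y)) <= NEIGHBOR_RADIUS
            if in_range and rid not in self.state_subs:
                # Start listening to this neighbor's high-rate topic.
                self.state_subs[rid] = self.create_subscription(
                    PoseStamped, f"/{rid}/state",
                    partial(self._on_peer_state, rid), 10)
            elif not in_range and rid in self.state_subs:
                self.destroy_subscription(self.state_subs.pop(rid))
```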
The benefit of using RLlib is that it makes sim2real very straightforward. Train in your RLlib+Gym environment, then just write code that maps control commands (e.g. position, velocity) into physical motion for your agents. Or even better, train in the real world! We're hoping to make something like Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection (Google Research) more mainstream. We've already run experiments with position and velocity control for quadrotors, TurtleBots, and holonomic ground robots.
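That mapping code can be tiny. For velocity control on a TurtleBot, for instance, it's roughly something like the following (the limits are the published TurtleBot3 Burger specs; the `[-1, 1]` action range and the function itself are just for illustration):

```python
from geometry_msgs.msg import Twist

# TurtleBot3 Burger velocity limits, per the published specs.
MAX_LIN = 0.22   # m/s
MAX_ANG = 2.84   # rad/s


def action_to_twist(action) -> Twist:
    """Map a policy action in [-1, 1]^2 to a TurtleBot velocity command."""
    cmd = Twist()
    cmd.linear.x = float(action[0]) * MAX_LIN
    cmd.angular.z = float(action[1]) * MAX_ANG
    return cmd
```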
My question is this: Is the Ray team interested in applying RLlib to robotics problems, or is this out of scope? There are a few things we could use support with, such as RLlib wheel pipelines for ARM64 architectures, leaner versions of RLlib that run more quickly on SoC hardware, and general design/planning meetings to get the most out of Ray.
Hey @smorad, even though we are stretched thin for resources right now, we are of course interested in robotics applications. I guess the first question is about the ARM64 wheel? Could you ask for this here under Ray Core? Then someone will get back to you on whether and by when we can support these wheel builds. On a slimmer, faster RLlib: I'm not sure which particular changes would make RLlib more performant on SoC devices. Could you share your thoughts on this one? I'd like to know; maybe this would be easy to support.
@smorad It's cool to see someone working on real robots using RLlib. I have a similar idea: train with RLlib+ROS+simulation, then replace the simulation with real robots without changing the code/internals of the trained agent. Can you share a little bit about how you do that kind of setup?
We are planning on releasing the code once it's better documented and less buggy, but I'll share our high-level approach. As sensor data comes in, we aggregate it into an observation. Once we have a data point from each sensor, the observation is sent to a ROS inference node that calls into RLlib. The resulting action from the inference node is then sent to a motion control node that translates the RLlib action into actuator commands.
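In heavily condensed form, that pipeline looks roughly like the sketch below. The topic names, message types, and observation layout are illustrative, and the checkpoint-restore call (`Algorithm.from_checkpoint` / `compute_single_action`) is the Ray 2.x API, which varies across RLlib versions:

```python
import numpy as np
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import LaserScan, Imu
from geometry_msgs.msg import Twist

from ray.rllib.algorithms.algorithm import Algorithm


class InferenceNode(Node):
    def __init__(self, checkpoint_path: str):
        super().__init__("rllib_inference")
        # Restore the trained policy (exact API depends on RLlib version).
        self.algo = Algorithm.from_checkpoint(checkpoint_path)
        self.latest = {"scan": None, "imu": None}
        self.create_subscription(LaserScan, "scan", self._on_scan, 10)
        self.create_subscription(Imu, "imu", self._on_imu, 10)
        self.cmd_pub = self.create_publisher(Twist, "cmd_vel", 10)

    def _on_scan(self, msg):
        self.latest["scan"] = np.asarray(msg.ranges, dtype=np.float32)
        self._maybe_step()

    def _on_imu(self, msg):
        self.latest["imu"] = np.array(
            [msg.angular_velocity.z, msg.linear_acceleration.x],
            dtype=np.float32)
        self._maybe_step()

    def _maybe_step(self):
        # Act only once every sensor has contributed a fresh data point.
        if any(v is None for v in self.latest.values()):
            return
        obs = np.concatenate([self.latest["scan"], self.latest["imu"]])
        action = self.algo.compute_single_action(obs, explore=False)
        # Motion-control step: map the action to actuator commands.
        cmd = Twist()
        cmd.linear.x = float(action[0])
        cmd.angular.z = float(action[1])
        self.cmd_pub.publish(cmd)
        self.latest = {k: None for k in self.latest}


def main():
    rclpy.init()
    rclpy.spin(InferenceNode("/path/to/checkpoint"))
```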
Thanks, I've just posted in Ray Core. I think Ray tends to use a ton of memory, and it ended up OOMing our 2 GB Raspberry Pi 3s. We've since bought better SoCs, but I think the public would be interested in running on various potato-level SoCs.
I suppose we are most interested in a nice, unified way for the public to use RLlib on robots, and wanted your thoughts. I think the ideal end goal would be extending ExternalEnv into a ROS2ExternalEnv that handles things like aggregating incoming ROS messages and converting them (and the outgoing actions) to and from torch tensors, and then merging our code+tests into Ray master so it's less likely to break in the future.
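To sketch what I mean: a ROS2ExternalEnv could sit on top of RLlib's existing ExternalEnv loop (start_episode / get_action / log_returns / end_episode), with ROS subscription callbacks feeding observations into a queue and a motion-control node draining the action queue. The queues, spaces, and the (obs, reward, done) tuple convention below are assumptions for illustration, not a proposed API:

```python
import queue

import gymnasium as gym  # or `import gym` on older RLlib versions
import numpy as np
from ray.rllib.env.external_env import ExternalEnv


class ROS2ExternalEnv(ExternalEnv):
    """Bridges ROS 2 sensor callbacks into RLlib's external-env loop."""

    def __init__(self, obs_queue: queue.Queue, action_queue: queue.Queue):
        super().__init__(
            action_space=gym.spaces.Box(-1.0, 1.0, shape=(2,)),
            observation_space=gym.spaces.Box(-np.inf, np.inf, shape=(362,)),
        )
        self.obs_queue = obs_queue        # filled by ROS subscription callbacks
        self.action_queue = action_queue  # drained by the motion-control node

    def run(self):
        # ExternalEnv runs this in its own thread; RLlib trains off the
        # (obs, action, reward) stream it produces.
        while True:
            episode_id = self.start_episode()
            obs, _, _ = self.obs_queue.get()  # initial obs, reward unused
            done = False
            while not done:
                action = self.get_action(episode_id, obs)
                self.action_queue.put(action)  # tensor->message conversion here
                # The ROS side executes the action, then reports the result.
                obs, reward, done = self.obs_queue.get()
                self.log_returns(episode_id, reward)
            self.end_episode(episode_id, obs)
```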