Using a trained RL model with TFLite?

My real endgame is running a trained RL model on a Coral [from here: https://coral.ai/ ; I have the USB variant, and it’s working fine with other things ].

As far as I can tell, the first step is to convert the model to TFLite, so that it can then be compiled into something that runs on the Coral [using this tool: Edge TPU Compiler | Coral ]
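
For concreteness, here is roughly what I'm picturing for that conversion step (an untested sketch; "policy_export_dir" is a placeholder for wherever the policy would end up as a TF SavedModel, and as I understand it the Edge TPU Compiler also wants the model fully int8-quantized, which this sketch skips):

# Untested sketch: turn an exported TF SavedModel into a .tflite file,
# which would then go through the Edge TPU Compiler separately.
# "policy_export_dir" is a placeholder path, not something RLlib produces by default.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("policy_export_dir")
tflite_model = converter.convert()

with open("policy.tflite", "wb") as f:
    f.write(tflite_model)

# Then, outside Python: edgetpu_compiler policy.tflite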

So, am I chasing silly ideas? Or is there a good path from a trained RLLib model, to TFLite? Or is there some completely-other approach I’d be better off using?

Cheers!
Gary

I’d imagine yes. What format is your exported RLlib model in right now?

Currently I’m just using the command line tool “rllib”:

#!/bin/sh
ALG=PPO
EXPERIMENT=${ALG}
WORKDIR=rtamray_results

rllib train --run ${ALG} --env RTAM-v0 \
        --checkpoint-freq 5 --checkpoint-at-end --keep-checkpoints-num 3 \
        --experiment-name ${EXPERIMENT} --local-dir ${WORKDIR} \
        --config '{"num_workers": 4}'

Which outputs opaque “checkpoint” files:

rtamray_results/PPO$ ls -l PPO_RTAM-v0_fcf10_00000_0_2021-09-24_14-25-49/checkpoint_000015/
total 2820
-rw-rw-r-- 1 chunky chunky 2879800 Sep 24 14:27 checkpoint-15
-rw-rw-r-- 1 chunky chunky 181 Sep 24 14:27 checkpoint-15.tune_metadata
chunky@gbriggs-desktop:~/src/rtam_openai/agents/ray/rtamray_results/PPO$
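
My best guess is that I'd first need to restore the checkpoint into a trainer and export the policy as a TF SavedModel before TFLite even enters the picture, something like this (untested sketch; I'm assuming Trainer.export_policy_model() does what I think it does, and "policy_export_dir" is just a placeholder):

# Untested sketch: restore the checkpoint and export the policy graph as a
# TF SavedModel. export_policy_model() is my assumption about the RLlib API;
# this also assumes the custom RTAM-v0 env is registered the same way as during training.
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()
trainer = PPOTrainer(config={"num_workers": 0}, env="RTAM-v0")
trainer.restore("rtamray_results/PPO/PPO_RTAM-v0_fcf10_00000_0_2021-09-24_14-25-49/checkpoint_000015/checkpoint-15")
trainer.export_policy_model("policy_export_dir")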

I’m also a little unsure how much extra “magic” is going on under the hood: even if I were able to get a .tflite file out, what additional steps would it take to execute that model and get real actions out? [And are there any other odd layers that would need setting up between the observation and the first layer of the tflite model?]
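
For example, the kind of step I'm worried about is the observation preprocessing RLlib does before the first layer. My guess at what reproducing that outside RLlib would look like (untested sketch, assuming ray.rllib.models.preprocessors.get_preprocessor behaves the way I think it does, and that RTAM-v0 is registered with gym):

# Untested sketch: reproduce RLlib's observation preprocessing outside the Trainer.
import gym
from ray.rllib.models.preprocessors import get_preprocessor

env = gym.make("RTAM-v0")
prep_cls = get_preprocessor(env.observation_space)
prep = prep_cls(env.observation_space)

obs = env.reset()
flat_obs = prep.transform(obs)   # this flattened array is what the model's first layer would see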

Thanks,
Gary

This is the entire class I’m currently using for the forward pass:

import time

# Ray 1.x import paths assumed here; DreamInterface comes from elsewhere in this project
from ray.rllib.agents.es import ESTrainer
from ray.rllib.agents.trainer import with_common_config


class RayDream(DreamInterface):
    def __init__(self, dreamenv, checkpoint, navmode):
        # Rebuild a trainer config that matches training, then restore the checkpoint
        rtam_config = with_common_config({
            "env": "RTAM-v0",
            "num_workers": 0
        })
        self.navmode = navmode
        self.dreamenv = dreamenv
        self.agent = ESTrainer(rtam_config, env="RTAM-v0")
        self.agent.restore(checkpoint)

    def get_single_action(self, obs):
        # Ask the restored policy for a single action from one observation
        action_step = self.agent.compute_action(obs)
        return action_step

    def get_one_path(self, loc, heading_deg, deadline_t=None, moved_players={}, tgts=None):
        # Roll out one episode in the dream env, recording the path until it
        # finishes or the optional wall-clock deadline passes
        obs = self.dreamenv.reset(loc, heading_deg, moved_players, tgts)
        curr_t = time.time()
        path = [self.dreamenv.loc]
        ep_complete = False

        while not ep_complete and (deadline_t is None or curr_t < deadline_t):
            action_step = self.agent.compute_action(obs)
            obs, reward, ep_complete, debug = self.dreamenv.step(action_step)
            path.append(self.dreamenv.loc)
            curr_t = time.time()

        return path
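
If the TFLite route works out, the idea would be to swap agent.compute_action() for something like this on the Coral (untested sketch; "policy_edgetpu.tflite" is a placeholder, the tensor shapes/dtypes depend on how the model was exported, and a quantized Edge TPU model would also need input scaling, glossed over here):

# Untested sketch of the Coral-side forward pass, replacing agent.compute_action().
import numpy as np
from tflite_runtime.interpreter import Interpreter, load_delegate

interpreter = Interpreter(model_path="policy_edgetpu.tflite",
                          experimental_delegates=[load_delegate("libedgetpu.so.1")])
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def get_single_action(flat_obs):
    # flat_obs: preprocessed observation as a numpy array matching the model input
    interpreter.set_tensor(input_details[0]["index"],
                           np.expand_dims(flat_obs, 0).astype(input_details[0]["dtype"]))
    interpreter.invoke()
    return interpreter.get_tensor(output_details[0]["index"])[0]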

Cheers!
Gary

I added a comment to a GitHub issue that aligns with this question:

Any guidance would be greatly appreciated.

Gary

gentle bump

I feel a little out of my depth with this; how to go from a trained RLlib checkpoint to a functioning model without the ray/rllib infrastructure is still a fairly opaque process to me.

Thanks,
Gary

I already added a comment to the above-mentioned GitHub PR, but for anyone reading this thread, here is a link to a tool that converts an RLlib checkpoint to an ONNX model and can run inference with ONNX Runtime: GitHub - airboxlab/rllib-fast-serve: Tools and examples to export policies trained with Ray RLlib for lightweight and fast inference
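
Once you have the ONNX file, inference without any Ray dependency looks roughly like this (a minimal sketch; "policy.onnx", the input name and the observation shape are placeholders that depend on how the policy was exported):

# Minimal sketch of Ray-free inference on an exported policy with ONNX Runtime.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("policy.onnx")
input_name = sess.get_inputs()[0].name

obs = np.zeros((1, 42), dtype=np.float32)   # placeholder flattened observation
outputs = sess.run(None, {input_name: obs})
print(outputs)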

Thanks for that @antoine-galataud !