Hello,
I am trying to figure out whether I can run RLlib with the framework=tf2 config setting while still getting lazy (graph/traced) evaluation of the policies. Can I?
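For context, this is roughly the kind of setup I mean (PPO on CartPole is just a stand-in example, not my actual workload):

```python
import ray
from ray import tune

ray.init()

config = {
    "env": "CartPole-v0",
    "framework": "tf2",   # use the native-TF2 ("eager") policies
    "num_workers": 2,
}

# Short run just to illustrate the config I am asking about.
tune.run("PPO", config=config, stop={"training_iteration": 10})
```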
So far I have a few observations about the different policy kinds:
- framework=tf2 is not the default framework.
- tf2 mode runs the “eager” policies (the ones created via the “as_eager” API) in plain eager mode, which is slower. As far as I can tell, it never builds concrete functions and does nothing to prevent retracing churn on the same policy (see the small tf.function sketch after this list for what I mean).
- the lazy (static-graph) policies do not work in tf2 mode at all because of compatibility issues.
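To be explicit about terminology, this toy snippet is what I mean by lazy/concrete evaluation in TF2 (the forward body is just a placeholder, not RLlib code):

```python
import tensorflow as tf

@tf.function
def forward(obs):
    # Placeholder for a policy's forward pass.
    return obs * 2.0

# Trace once into a graph for this input signature; later calls with the
# same signature reuse the traced graph instead of re-running eagerly.
concrete = forward.get_concrete_function(tf.TensorSpec([None, 4], tf.float32))
print(concrete(tf.zeros([1, 4])))
```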
Is there any way (or any upcoming direction) to train with concrete (lazy) functions in TF2 mode? If there is, how do I do that?
Right now it seems there is no point in training with native TF2 at all: the most efficient policies are still the compatibility-mode (static-graph) ones.
Thank you.