Best way to save policy

Hi there,

I’d like to be able to save a policy (not a checkpoint) that can be loaded into a trainer later on.

I have one policy that is trained over, say, 10000 training iterations, and I’d like to save it every 100 iterations so I can run experiments on the snapshots later (as in the self-play paper: http://arxiv.org/abs/2006.04471). Loading a full checkpoint each time is too slow, and the checkpoint files are pretty large because they include all the other policies in the multiagent config.

I saw the policy.export_model method, which creates a TorchScript file, but I’m not sure that’s what I want.

I guess a better way might be to pickle the output of policy.get_state(), then keep a single initialized trainer and call policy.set_state() with the loaded state for each experiment.
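A minimal sketch of that idea, assuming the policy object exposes get_state() and set_state() as mentioned above and that the returned state is picklable. The helper names save_policy_state and load_policy_state, the snapshot paths, and the policy id in the usage comment are hypothetical.

```python
import pickle
from pathlib import Path


def save_policy_state(policy, path):
    """Pickle the output of policy.get_state() to `path`."""
    path = Path(path)
    # Create the snapshot directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(policy.get_state(), f)


def load_policy_state(policy, path):
    """Load a pickled state from `path` and apply it via policy.set_state()."""
    with Path(path).open("rb") as f:
        policy.set_state(pickle.load(f))


# Hypothetical usage inside a training loop ("main" is an assumed policy id):
# for i in range(10000):
#     trainer.train()
#     if (i + 1) % 100 == 0:
#         save_policy_state(trainer.get_policy("main"),
#                           f"snapshots/main_{i + 1}.pkl")
```

Each snapshot then holds only the one policy's state, and a single already-built trainer can be reused by calling load_policy_state on its policy before each experiment.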

You can do something like:

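    import torch
    from ray.rllib.agents.impala import ImpalaTrainer  # Ray 1.x import path

    train = ImpalaTrainer(...)
    # Do some training
    ...
    # Save the policy's underlying torch model (this pickles the whole module)
    torch.save(train.get_policy().model, model_path)
    ...
    # train.restore() expects a trainer checkpoint; load the saved model's weights instead
    loaded = torch.load(model_path)
    train.get_policy().model.load_state_dict(loaded.state_dict())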

    Traceback (most recent call last):
      File "attention_net.py", line 227, in
        torch.save(trainer.get_policy().model, model_path)
      File "D:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\torch\serialization.py", line 372, in save
        _save(obj, opened_zipfile, pickle_module, pickle_protocol)
      File "D:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\torch\serialization.py", line 476, in _save
        pickler.dump(obj)
    _pickle.PicklingError: Can't pickle <class 'ray.rllib.models.catalog.FullyConnectedNetwork_as_AttentionWrapper'>: attribute lookup FullyConnectedNetwork_as_AttentionWrapper on ray.rllib.models.catalog failed
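
The error message suggests that the wrapped model class is created at runtime, so pickle cannot find it by name in ray.rllib.models.catalog. The sketch below reproduces that situation in plain Python and shows that pickling only the state still works. Net, wrap, and the get_state/set_state methods here are hypothetical stand-ins, not RLlib code; for a torch model the analogous step would presumably be saving model.state_dict() rather than the module itself.

```python
import pickle


class Net:
    """Tiny stand-in for a model; keeps its parameters in a dict."""

    def __init__(self):
        self.weights = {"w": [0.1, 0.2], "b": [0.0]}

    def get_state(self):
        return dict(self.weights)

    def set_state(self, state):
        self.weights = dict(state)


def wrap(base, suffix):
    # Build a subclass at runtime. Its name is never bound in any module,
    # so pickle's attribute lookup for the class fails.
    return type(f"{base.__name__}_as_{suffix}", (base,), {})


model = wrap(Net, "Wrapper")()

# Pickling the whole object needs to locate its class by name, which fails.
try:
    pickle.dumps(model)
    whole_object_pickles = True
except pickle.PicklingError:
    whole_object_pickles = False

# The plain state pickles fine and can be applied to a freshly built instance.
blob = pickle.dumps(model.get_state())
restored = wrap(Net, "Wrapper")()
restored.set_state(pickle.loads(blob))
```

The design choice is to serialize data only (weights or state) and rebuild the object through the normal construction path before loading that data into it.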