How to configure the neural networks in A3C?

+++++++++++++++++++

What I want to do:

Hi, I am trying to train A3C agents in my custom environment, and I want to use Tune to find out the optimal hyperparameters of the neural networks in the A3C algorithm.

I have zero TensorFlow or PyTorch background, so the only thing I can do is to use default networks/policies in RLlib.

+++++++++++++++++++

Here is what I know:

  1. The A3C algorithm uses an actor-critic structure, where the actor and the critic each have a neural network. Furthermore, according to the A3C paper, the actor’s network and the critic’s network share some parameters.
  2. By setting the values in an RL agent’s config dict that relate to the hyperparameters of the agent’s networks, and then passing the config dict to the “tune.run()” function, I can modify the agent’s network (e.g., the number of layers or units) and start training the agent.
  3. By setting “model”: {“use_lstm”: True} in the A3C agent’s config dict, I can make the A3C agent use an LSTM.

+++++++++++++++++++

Here are my questions:

  • Q1: Where can I configure the neural networks in A3C? When I looked at the A3C agent’s config dict (in the a3c.py file), I found no key relating to the neural network configuration. In contrast, DQN’s config dict has a key named “hiddens” that can be used to change the neural network in DQN.

  • Q2: What is the structure of the neural networks in RLlib’s implementation of A3C?

  • Q3: When I enable LSTM by setting “use_lstm: True”, is the actor’s neural network or the critic’s neural network converted to LSTM?

Hi @Roller44,

Here is a link to all the model options in RLlib.

The config you provide will have a key called “model” that holds a dictionary with these values.

These options are used by all of the RLlib algorithms (DQN, A3C, etc.)
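As a rough sketch of what that looks like, the snippet below builds a few candidate config dicts that differ only in their “model” entry. It is plain Python so it runs without Ray installed. The environment name, the helper name `make_a3c_configs`, and the candidate values are made up for illustration, and “fcnet_activation” is assumed to be one of the keys in the model options; check the options page for your RLlib version.

```python
from itertools import product

# Candidate values are illustrative only, not recommendations.
hidden_layouts = [[64, 64], [256, 256]]
activations = ["tanh", "relu"]


def make_a3c_configs(env_name):
    """Build one config dict per combination of model settings."""
    configs = []
    for hiddens, activation in product(hidden_layouts, activations):
        configs.append({
            "env": env_name,  # hypothetical name of a registered custom env
            "model": {
                # Sizes of the fully connected hidden layers.
                "fcnet_hiddens": hiddens,
                # Assumed key name for the hidden-layer activation.
                "fcnet_activation": activation,
            },
        })
    return configs


configs = make_a3c_configs("my_custom_env")
for cfg in configs:
    print(cfg["model"])
```

With Tune you would normally not enumerate the combinations by hand. Wrapping the candidate lists in `tune.grid_search([...])` inside a single config dict and passing that dict to `tune.run` should produce the same set of trials.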

Both the actor and the critic will share the LSTM.
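To turn the LSTM on, the flag goes inside the same “model” dictionary. A minimal sketch, assuming a base config like the one you already have; the helper `with_lstm` and the environment name are hypothetical, and “lstm_cell_size” is assumed to be the model option that sets the LSTM width:

```python
import copy

# Hypothetical base config; replace the env name with your own.
base_config = {
    "env": "my_custom_env",
    "model": {"fcnet_hiddens": [64, 64]},
}


def with_lstm(config, cell_size=128):
    """Return a copy of config with the LSTM options set under "model"."""
    new_config = copy.deepcopy(config)  # leave the caller's dict untouched
    new_config["model"]["use_lstm"] = True
    # Assumed key name for the LSTM cell size.
    new_config["model"]["lstm_cell_size"] = cell_size
    return new_config


lstm_config = with_lstm(base_config)
print(lstm_config["model"])
```

Copying the dict first keeps the non-LSTM config around, which is handy if you want to compare both variants in one Tune experiment.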

Thanks for the reply!

Follow-up questions:

  1. Do you know what the structure of the neural networks is in RLlib’s implementation of A3C? Based on your reply, it seems that the actor network and the critic network are identical.
  2. I have noticed that there is a “hiddens” key (link) in DQN’s config, while there is a “fcnet_hiddens” key (link) in the common model config. Does this mean that in DQN there is one network (specified by the “hiddens” key) stacked on top of another network (specified by the “fcnet_hiddens” key)?