What is the algorithm implemented by the A3C agent?

Roller44 · November 17, 2021, 6:29am

The A3C paper presented multi-threaded asynchronous versions of four algorithms, namely, one-step Sarsa, one-step Q-learning, n-step Q-learning, and advantage actor-critic.

My question is: when I train an A3C agent (e.g., by calling A3CTrainer.train()), what is the algorithm that is used by the A3C agent?

mannyv · November 17, 2021, 12:36pm

The first three you list were implemented as comparisons for the main method the paper introduced, which is Asynchronous Advantage Actor-Critic (A3C). That is the algorithm the A3C agent implements.
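To make the "advantage" part concrete, here is a minimal sketch of how n-step advantage estimates can be computed for one rollout segment, following the general recipe in the paper. The function name `n_step_advantages` and its arguments are hypothetical and chosen for illustration; this is not RLlib's internal code.

```python
from typing import List


def n_step_advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> List[float]:
    """Return advantage estimates A_t = R_t - V(s_t) for one rollout segment.

    rewards[t] is the reward received at step t and values[t] is the critic's
    estimate V(s_t). bootstrap_value is V(s_T) for the state reached after the
    last step (use 0.0 if that state is terminal). All names are illustrative.
    """
    advantages = [0.0] * len(rewards)
    ret = bootstrap_value
    # Walk backwards so each discounted return reuses the one after it.
    for t in reversed(range(len(rewards))):
        ret = rewards[t] + gamma * ret
        advantages[t] = ret - values[t]
    return advantages


if __name__ == "__main__":
    print(n_step_advantages([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], 0.0, gamma=0.5))
```

In the actor-critic update, these advantages weight the policy's log-probability gradient, while the discounted returns serve as regression targets for the value function. In the asynchronous variant, each worker computes this on its own rollouts and applies its gradients to shared parameters.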

mannyv · November 17, 2021, 1:55pm

@Roller44,

There is a nice intro to RL guide that OpenAI has put together. Keep in mind that it is pretty myopically focused on the algorithms that were invented there.

https://spinningup.openai.com/en/latest/

There is some evidence that A3C is actually not that efficient compared to other approaches. The common guidance I usually see is that you are better off either using A2C for synchronous training or IMPALA for asynchronous training with large numbers of workers.

Roller44 · November 17, 2021, 2:45pm

Got it. Thanks very much.
