How does RolloutWorker work (how is experience added to the replay buffer)?

  • High: It blocks me from completing my task.

I know this is a very simple question, but please bear with me because I am new to Ray and RLlib.

I am currently trying to train a reinforcement learning agent with Soft Actor-Critic (SAC) using images as input. While debugging, I looked at SampleBatch.CUR_OBS, which, judging from its shape, seems to store the values after they have passed through the conv layers rather than the raw images.

Then I realized that I do not know where SampleBatch.CUR_OBS is defined or populated.

I have read about how experience is stored at the following URL, but I have not been able to follow the code because of its complexity.

My questions are as follows.

・Where is the code that saves the observations retrieved from the environment to the replay buffer during execution?
・With image input, is it expected that CUR_OBS holds the values after the conv layers rather than the image data itself?
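
To make the first question concrete, this is the generic pattern I have in mind, written as a minimal sketch in plain Python. It is not RLlib's actual code: ToyReplayBuffer, toy_env_step, and collect are hypothetical names I made up for illustration, and in this sketch the buffer stores the raw image observations, not conv-layer outputs.

```python
import random
from collections import deque


class ToyReplayBuffer:
    """Hypothetical fixed-capacity buffer; NOT RLlib's replay buffer."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.storage = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done):
        self.storage.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def toy_env_step(obs, action):
    """Hypothetical stand-in for env.step(); returns a new 2x2 'image'."""
    next_obs = [[(pixel + action) % 256 for pixel in row] for row in obs]
    reward = 1.0
    done = False
    return next_obs, reward, done


def collect(buffer, num_steps):
    """Rollout loop: act in the environment and store raw observations."""
    obs = [[0, 0], [0, 0]]  # initial 2x2 'image'
    for _ in range(num_steps):
        action = random.randint(0, 3)  # random policy, for illustration only
        next_obs, reward, done = toy_env_step(obs, action)
        buffer.add(obs, action, reward, next_obs, done)
        obs = next_obs


buffer = ToyReplayBuffer(capacity=5)
collect(buffer, num_steps=8)
batch = buffer.sample(batch_size=3)
```

What I could not find is which part of RLlib plays the role of collect() and buffer.add() here.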

Sorry for the trouble, but I have since resolved this myself.