Is there a way to “normalize observations” automatically in RLlib?
In particular, I am interested in centering observations around the mean and scaling by the standard deviation, as follows:
new observation = (old observation - running mean) / running std dev
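For concreteness, a tiny NumPy illustration of the transform I mean (the stats here are just stand-ins for running statistics):

import numpy as np

obs = np.array([2.0, 4.0, 6.0])
running_mean, running_std = obs.mean(), obs.std()  # stand-ins for running stats
new_obs = (obs - running_mean) / (running_std + 1e-8)  # -> roughly [-1.22, 0., 1.22]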
If there is a way to normalize observations, where in the “config” dictionary should it be passed and how?
Thanks!
config = (
    PPOConfig()
    .training(lr=5e-3, num_sgd_iter=10, train_batch_size=256)
    .framework("torch")
    .rollouts(num_rollout_workers=1)
    .resources(num_gpus=0, num_cpus_per_worker=1)
    .environment(
        env=env_name,
        env_config={  # env_config: arguments passed to the Env
            "num_workers": N_CPUS - 1,  # number of parallel workers
            "disable_env_checking": True,
        },
    )
)
luzgui
December 21, 2022, 4:49pm
2
Hi, I am having the same question. Have you found out the answer?
Best
Hi! Yes, it is just a matter of applying this configuration:
"observation_filter": "MeanStdFilter",
I took the example from here:
I’m using MeanStdFilter in my PPO example. It works during the training process.
But I’m not sure whether to use filtered observations when calling trainer.compute_single_action.
In my opinion, the model was trained based on filtered data, and thus the compute_single_action function should also take the filtered obs as the input.
However, in my example, the action obtained without the filtered obs performs better than the action obtained with filtered obs.
Then, what is the correct way to …
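(For what it’s worth, if you do want to apply the trained filter manually at inference time, something like the following should work. This is an untested sketch assuming Ray 2.x, where the local rollout worker stores per-policy filters; algo is a trained Algorithm and raw_obs a raw env observation. Also worth verifying for your Ray version whether compute_single_action already applies the filter internally:)

# Fetch the filter learned during training and apply it without
# updating its statistics, then query the trained algorithm.
flt = algo.workers.local_worker().filters["default_policy"]
filtered_obs = flt(raw_obs, update=False)  # do not update stats at eval time
action = algo.compute_single_action(filtered_obs, explore=False)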
Here is the source code, to make sure it’s applying what you need:
import logging
import threading

import numpy as np

logger = logging.getLogger(__name__)


class Filter:
    """Processes input, possibly statefully."""

    def apply_changes(self, other, *args, **kwargs):
        """Updates self with "new state" from other filter."""
        raise NotImplementedError

    def copy(self):
        """Creates a new object with same state as self.

        Returns:
            A copy of self.
        """
(File truncated; see the original source for the full implementation.)
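Conceptually, the MeanStdFilter keeps running statistics and applies the (obs - mean) / std transform. A simplified sketch of that idea (a Welford-style update; not the actual RLlib code):

import numpy as np

class RunningMeanStd:
    """Simplified running mean/std tracker, for intuition only."""

    def __init__(self, shape):
        self.n = 0
        self.mean = np.zeros(shape)
        self.m2 = np.zeros(shape)  # running sum of squared deviations

    def update(self, x):
        # Welford's online update for mean and variance.
        self.n += 1
        delta = x - self.mean
        self.mean = self.mean + delta / self.n
        self.m2 = self.m2 + delta * (x - self.mean)

    def normalize(self, x):
        std = np.sqrt(self.m2 / max(self.n - 1, 1)) + 1e-8
        return (x - self.mean) / std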
And this GitHub issue is where I got my answer from:
(GitHub issue, opened 10 Jul 2020, closed 10 Jan 2021; labels: question, stale)
### What is your question?
I want to normalize my observations without knowing the exact range up front; hence, I think using a running mean for normalization would be best. I only want to apply this normalization to parts of my dict observation space.
What's the recommended way to do that?
The [RLlib documentation](https://docs.ray.io/en/latest/rllib-models.html#custom-preprocessors) points to [Gym wrappers](https://github.com/openai/gym/tree/master/gym/wrappers), but I didn't see any wrapper for running mean normalization of observations (possible that I missed something).
Other frameworks have their own utility class for this, e.g., [OpenAI baselines](https://github.com/openai/baselines/blob/master/baselines/common/vec_env/vec_normalize.py) and [stable baselines](https://stable-baselines.readthedocs.io/en/master/guide/vec_envs.html#vecnormalize).
Does RLlib have something similar that I could use out of the box? Or do I need to implement it myself (what would be the best starting point)?
Hope this helps! Cheers.
luzgui
December 21, 2022, 5:53pm
4
Thank you very much. The problem is that I am using action masking, and the filter will also normalize the mask, so I am having the issue described in this other thread:
I’m attempting to use the MeanStdFilter observation filter with an environment that uses action masking and I believe the filter is also normalizing the action mask. I’m using ray 0.8.5 with tensorflow 1.15.4. Here is a script to recreate the issue:
import argparse
import random
import numpy as np
import gym
from gym.spaces import Box, Discrete, Dict, Tuple
import ray
from ray import tune
from ray.rllib.models import ModelCatalog
from ray.rllib.models.tf.fcnet_v2 import FullyConnectedNetwork…
Oh, maybe open another thread then, as I haven’t faced that one yet.
Cheers!
mannyv
December 22, 2022, 11:14pm
6
Hi @luzgui,
I do not have a solution for you, but I do have an idea for a workaround that might work.
If your mask consists of 0s and 1s, then after normalization you should still have exactly two unique values in the mask, so maybe you could reconstruct the mask with something like:
new_mask = (old_mask == old_mask.max()).float()
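A quick self-contained check of that idea (a torch sketch; the filter’s effect is emulated here with a plain mean/std shift):

import torch

old_mask = torch.tensor([0.0, 1.0, 1.0, 0.0])
# Emulate what MeanStdFilter does to the mask: shift and rescale it.
normalized = (old_mask - old_mask.mean()) / old_mask.std()
# The mask is still two-valued, so the max identifies the original 1s.
new_mask = (normalized == normalized.max()).float()
print(new_mask)  # tensor([0., 1., 1., 0.])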