Normalizing Observations

Is there a way to “normalize observations” automatically in RLlib?

In particular, I am interested in centering observations around the mean, as follows:

new observation = (old observation - running mean) / running std dev
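
To make the intended transformation concrete (subtract the running mean, then divide by the running standard deviation), here is a minimal sketch in plain Python. The class name `RunningMeanStd`, the use of Welford's update, the sample (n - 1) standard deviation, and the small `eps` term are all assumptions chosen for illustration; this is not RLlib's implementation.

```python
import math


class RunningMeanStd:
    """Hypothetical helper: running mean/std via Welford's online update."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Fold one new scalar observation into the running statistics.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Assumption: sample standard deviation (n - 1 in the denominator).
        if self.count < 2:
            return 1.0  # not enough data yet; fall back to no scaling
        return math.sqrt(self.m2 / (self.count - 1))

    def normalize(self, x, eps=1e-8):
        # (old observation - running mean) / running std dev
        return (x - self.mean) / (self.std + eps)


stats = RunningMeanStd()
for obs in [2.0, 4.0, 6.0]:
    stats.update(obs)

normalized = stats.normalize(6.0)
```

After the three updates the running mean is 4 and the running standard deviation is 2, so an observation of 6 maps to roughly 1.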

If there is a way to normalize observations, where in the “config” dictionary should it be passed and how?

Thanks!

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .training(lr=5e-3, num_sgd_iter=10, train_batch_size=256)
    .framework("torch")
    .rollouts(num_rollout_workers=1)
    .resources(num_gpus=0, num_cpus_per_worker=1)
    .environment(
        env=env_name,
        # env_config: arguments passed to the Env; num_workers = number of parallel workers
        env_config={"num_workers": N_CPUS - 1, "disable_env_checking": True},
    )
)

Hi, I have the same question. Have you found the answer?
Best

Hi! Yes, it is just a matter of applying this configuration:

"observation_filter": "MeanStdFilter",

I took the example from here:

Here is the source code, to make sure it’s applying what you need:

And this forum is where I’ve got my answers from:

Hope this helps! Cheers.
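
For readers wondering where the `"observation_filter"` entry goes, here is a minimal sketch that uses a plain dict-style config. The other keys and values are placeholders copied from the question, not recommendations, and the snippet deliberately avoids importing Ray so it runs on its own.

```python
# Placeholder settings taken from the question above (illustrative only).
base_config = {
    "framework": "torch",
    "lr": 5e-3,
    "train_batch_size": 256,
}

# The filter is a top-level key of the config dictionary,
# sitting next to settings such as "lr" and "framework".
config = {**base_config, "observation_filter": "MeanStdFilter"}
```

With the builder-style `PPOConfig` from the question, the same setting is, as far as I know, accepted as `.rollouts(observation_filter="MeanStdFilter")` in Ray 2.x releases that still have `.rollouts()`. Treat that as an assumption and check it against the version you have installed.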

Thank you very much. The problem is that I am using action masking, and the filter also normalizes the mask, which is the issue I describe in this other thread:

Oh, maybe open another thread then, as I haven’t faced that one yet.

Cheers!

Hi @luzgui,

I do not have a solution for you, but I do have an idea for a workaround that might work.

If your mask consists of 0s and 1s, then after normalization it should still contain only two unique values. Maybe you could reconstruct the mask with something like:

new_mask = (old_mask == old_mask.max()).float()
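
A rough way to sanity-check this idea without torch is the plain-Python sketch below. It assumes every mask entry is shifted and scaled by the same mean and std. The `shift_and_scale` function and its numbers are a hypothetical stand-in for the filter, not RLlib code. If the filter keeps separate statistics per entry, the two-value property need not hold and this trick may not apply.

```python
def shift_and_scale(values, mean, std):
    # Hypothetical stand-in for the filter: one shared mean/std for all entries.
    return [(v - mean) / std for v in values]


def recover_mask(normalized):
    # Entries equal to the maximum were 1s before normalization; the rest were 0s.
    # Caveat: an all-zero mask would come back as all ones, since every
    # entry then equals the maximum.
    top = max(normalized)
    return [1.0 if v == top else 0.0 for v in normalized]


mask = [1.0, 0.0, 1.0, 1.0, 0.0]
normalized_mask = shift_and_scale(mask, mean=0.6, std=0.5)
recovered = recover_mask(normalized_mask)
```

Here `recovered` matches the original `mask`, which is the same comparison the torch one-liner performs on a tensor.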