Mehdi
August 11, 2022, 2:57pm
High: It blocks me from completing my task.
Hi all,
Sorry if my question is basic.
My environment takes arguments for initialization, like Myenv(arg1, arg2).
How can I define “env_config” in Tune so that my model trains with different arg1 and arg2 values?
Thanks
Mehdi
August 11, 2022, 3:17pm
Thanks for the reply.
No, unfortunately not. In the examples at that link, the environment’s arguments stay fixed for the whole training run, but I need to train my model over a range of different arguments.
Can you share the constructor of your environment? What are the things you want to configure?
cc @kourosh
Mehdi
August 12, 2022, 8:16am
Thanks @xwjiang2010
This is my environment constructor:
import copy
import gym
import numpy as np
from ray.rllib.env.multi_agent_env import MultiAgentEnv

class Energy_Trading(MultiAgentEnv):
    def __init__(self, config):
        self.tx_count = 0
        self.buy_book = []
        self.sell_book = []
        self.fill_book = []
        self.transactions = []
        self.dones = set()
        self.dissatisfaction = 2
        self.total_timestep = 50
        # config is the per-agent power vector passed in through env_config
        power = np.array(config)
        self.l = power.size                # total number of agents
        self.lp = (power > 0).sum()        # agents with positive power
        self.ln = self.l - self.lp         # remaining agents
        agents = [f"agent{i}" for i in range(self.l)]
        self.agents = agents
        self.dict_agents = dict(zip(self.agents, power))
        self.state = copy.deepcopy(self.dict_agents)
        obs = gym.spaces.Box(low=-1, high=1, shape=(1,))
        act = gym.spaces.Box(low=0, high=1, shape=(1,))
        self.observation_space = gym.spaces.Dict({key: obs for key in agents})
        self.action_space = gym.spaces.Dict({key: act for key in agents})
        self._agent_ids = set(self.agents)
        self._spaces_in_preferred_format = True
        super().__init__()
In each episode, my environment should get a new config. In fact, this config variable initializes part of my environment for each episode.
I train my model using these lines of code:
tune.register_env("MyEnv", lambda config: Energy_Trading(p))
ray.init(local_mode=True)
tune.run(
    "PPO",
    stop={"episode_reward_mean": 200},
    config={
        "disable_env_checking": True,
        "env": "MyEnv",
        "env_config": {
            "config": p,
        },
        "num_gpus": 0,
        "num_workers": 1,
        "multiagent": {
            "policies": policies,
            "policy_mapping_fn": policy_mapping_fn,
        },
        "framework": "tf",
    },
)
I’m able to define the environment config once (by passing p to env_config),
but I want my model to be trained with different configs, meaning my code should pass a different p to env_config.
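If training one trial per config (rather than changing it every episode) were enough, Tune can sweep env_config values directly. A minimal sketch, assuming the registered creator actually uses the dict it receives and that p1, p2, p3 are candidate power vectors you supply yourself:

# Sketch only: Tune launches a separate PPO trial for each candidate config.
# p1, p2, p3 are hypothetical power vectors; supply your own values.
tune.register_env(
    "MyEnv", lambda env_config: Energy_Trading(env_config["config"])
)

tune.run(
    "PPO",
    config={
        "env": "MyEnv",
        # grid_search expands into one trial per listed value.
        "env_config": {"config": tune.grid_search([p1, p2, p3])},
        "num_workers": 1,
        "framework": "tf",
    },
)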
Mehdi
August 12, 2022, 8:25am
Also, if I don’t use Tune and use a for loop instead, I have something like this:
trainer = PPOTrainer(
    config={
        "disable_env_checking": True,
        "env": "MyEnv",
        "env_config": {
            "config": p,
        },
        "num_gpus": 0,
        "num_workers": 1,
        "multiagent": {
            "policies": policies,
            "policy_mapping_fn": policy_mapping_fn,
        },
        "framework": "tf",
    }
)

for i in range(100):
    print("STARTING A NEW EPISODE:")
    result = trainer.train()
So each episode uses the same config, while I need my model to be trained with a different config in each episode.
I hope I was clear enough.
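If changing the config once per training iteration (rather than per episode) were acceptable, the loop above could also mutate the running sub-environments between train() calls. A rough sketch; sample_p() and set_power() are hypothetical helpers (set_power() would redo the power-dependent setup from __init__):

# Sketch only: swap the power vector once per call to train(), i.e. once per
# training iteration, NOT once per episode. sample_p() and set_power() are
# hypothetical helpers you would have to implement yourself.
for i in range(100):
    new_p = sample_p()
    trainer.workers.foreach_worker(
        lambda w: w.foreach_env(lambda env: env.set_power(new_p))
    )
    result = trainer.train()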
Hi @Mehdi, it should be something like:
config = {
    ...
    "env": Energy_Trading,
    "env_config": {
        "arg1": arg1_value,
    },
}
This will be equivalent to constructing with Energy_Trading({'arg1': arg1_value})
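For that to work, the constructor just reads its settings out of the dict it receives; a minimal sketch (the arg1/arg2 keys are illustrative, not from the original class):

# Sketch: the env_config dict shows up as `config` in the constructor.
class Energy_Trading(MultiAgentEnv):
    def __init__(self, config):
        self.arg1 = config.get("arg1")   # illustrative keys only
        self.arg2 = config.get("arg2")
        # ... rest of the setup as in the original __init__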
Mehdi
August 15, 2022, 9:30am
Thanks @kourosh
That’s true, but in my problem arg1_value should change for each episode, and this way we only pass arg1_value to the environment once.
If that’s the case, you should use RLlib callbacks to modify the parameter within each episode. Trainer.train() also does not have episode granularity.
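A rough sketch of that callback idea, assuming a hypothetical set_power() method on the env and a placeholder sample_p() that draws a new power vector:

from ray.rllib.algorithms.callbacks import DefaultCallbacks

# Sketch only: set_power() and sample_p() are hypothetical helpers, not part
# of the code above. Depending on the RLlib version, on_episode_start fires
# after env.reset(), so the new vector may only fully apply from the next
# reset unless reset() also re-reads it.
class NewConfigPerEpisode(DefaultCallbacks):
    def on_episode_start(self, *, worker, base_env, policies, episode,
                         env_index=None, **kwargs):
        new_p = sample_p()  # draw a fresh power vector for this episode
        for env in base_env.get_sub_environments():
            env.set_power(new_p)

# Then register the callback class in the config:
config = {
    "env": "MyEnv",
    "callbacks": NewConfigPerEpisode,
    # ... rest of the PPO config as above
}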