Severity: somewhere between
- Low: It annoys or frustrates me for a moment.
- Medium: It causes significant difficulty in completing my task, but I can work around it.
I’m working on my master’s thesis, which is about applying reinforcement learning to a PlayStation 1 game.
Here is a quick summary of what I did, and the situation I’m stuck on.
My gymnasium environment makes a TCP connection to an emulator upon instantiating the environment.
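For context, the connection setup looks roughly like this (a simplified sketch, not my actual interface; the class name, host, and port are placeholders):

```python
import socket

class EmulatorConnectedEnv:
    """Simplified sketch: an environment-like object that opens a TCP
    connection to the emulator as soon as it is constructed."""

    def __init__(self, host="127.0.0.1", port=9999):
        # The connection happens here, at instantiation time, which is
        # why every extra env instance tries to grab its own emulator.
        self.sock = socket.create_connection((host, port), timeout=5.0)

    def close(self):
        self.sock.close()
```

So any code path that constructs the environment a second time also opens a second TCP connection.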
I originally had a lot of trouble getting this to work, but here was my final workaround:
import gymnasium
import ray
from ray.rllib.algorithms.ppo import PPOConfig
from ray.tune.registry import register_env
from rtgym import DEFAULT_CONFIG_DICT

my_config = DEFAULT_CONFIG_DICT
my_config["interface"] = MyGranTurismoRTGYM
my_config["time_step_duration"] = 0.05
my_config["start_obs_capture"] = 0.05
my_config["time_step_timeout_factor"] = 1.0
my_config....

def env_creator(env_config):
    env = gymnasium.make("real-time-gym-v1", config=my_config)
    return env  # return an env instance

register_env("gt-rtgym-env-v1", env_creator)

ray.init()
algo = (
    PPOConfig()
    .environment(
        env="gt-rtgym-env-v1",
        disable_env_checking=True,
        render_env=False,
    )
    ...
    .build()
)

# Some training loops, and then:
path_to_checkpoint = algo.save()
And things work well: the agent is learning and getting better.
Now I wish to reload this training, based on the example the documentation site shows:
import gymnasium
import ray
from ray.rllib.algorithms.algorithm import Algorithm
from ray.tune.registry import register_env
from rtgym import DEFAULT_CONFIG_DICT

my_config = DEFAULT_CONFIG_DICT
my_config["interface"] = MyGranTurismoRTGYM
my_config["time_step_duration"] = 0.05
my_config["start_obs_capture"] = 0.05
my_config["time_step_timeout_factor"] = 1.0

def env_creator(env_config):
    env = gymnasium.make("real-time-gym-v1", config=my_config)
    return env  # return an env instance

register_env("gt-rtgym-env-v1", env_creator)

ray.init()
algo = Algorithm.from_checkpoint("C:/PPO_gt-rtgym-env-v1_2023-05-13_00-31-13_kmaa6ii/checkpoint_000161")

episode_reward = 0
terminated = truncated = False
obs, info = env.reset()
while not terminated and not truncated:
    action = algo.compute_single_action(obs)
    obs, reward, terminated, truncated, info = env.step(action)  # how to access the same env that the algo is re-attempting to create?
    episode_reward += reward
I cannot use the above method because, in my case, Algorithm.from_checkpoint("C:/PPO_gt-rtgym-env-v1_2023-05-13_00-31-13_kmaa6ii/checkpoint_000161")
re-instantiates an environment (I can see the agent trying to connect to the emulator via TCP).
Then obs, info = env.reset()
obviously does not work, since there is no env variable in scope.
If I attempt env = gymnasium.make("real-time-gym-v1"),
it will also not work, since this instantiates a second environment (which will try to connect to another emulator).
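I suspect this comes down to where Ray creates the environment. My understanding (not verified against my exact Ray version, so treat these parameter names as assumptions) is that the rollout settings control whether an env instance lives in the driver process at all, roughly:

```python
# Config fragment (names from the Ray 2.x AlgorithmConfig API of that
# era; they may differ by version). With zero remote rollout workers
# and create_env_on_local_worker=True, the single env instance should
# live in the driver process rather than in a remote worker.
config = (
    PPOConfig()
    .rollouts(num_rollout_workers=0, create_env_on_local_worker=True)
)
```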
So, my questions are:
- Is there a way for me to directly use the restored algorithm’s environment, i.e. call reset() on the env that Algorithm.from_checkpoint() created?
- Is there a way for me to drop the restored environment and make the algorithm use my other one instead?
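One workaround I’ve been considering (just a sketch, under the assumption that the restored algorithm calls my registered env_creator in the same driver process, e.g. with zero remote rollout workers) is to cache the instance so that every caller of env_creator gets the single already-connected environment:

```python
# Hypothetical sketch: cache the env so repeated env_creator calls
# (mine and the restored algorithm's) share one instance, and hence
# one TCP connection. This only helps if all calls happen in the same
# process; remote rollout workers each run in their own process.
_cached_env = None

def make_connected_env():
    # Placeholder for the real construction, e.g.
    # gymnasium.make("real-time-gym-v1", config=my_config)
    return object()

def env_creator(env_config=None):
    global _cached_env
    if _cached_env is None:
        _cached_env = make_connected_env()
    return _cached_env
```

But I don’t know whether this is the intended way to do it, hence the questions above.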
FYI, I tried searching a lot on this, but it’s a bit challenging with the changes in the API.