Hi folks,

I am a little lost here. I am writing a custom policy and environment and want to train with `trainer.train()`. The following code
```python
import env
import policies
import pandas as pd
import ray
from ray.rllib.agents.trainer_template import build_trainer

df = pd.read_csv('env_data.csv')
ray.init(ignore_reinit_error=True, local_mode=True)

MyTrainer = build_trainer(
    name="DeterministicPolicy",
    default_policy=policies.DeterministicPolicy)

config = {
    "env": env.MyEnv,
    "env_config": {
        "data": df,
    },
    "model": {
        "custom_model_config": {
            "memory_length": 1,
        },
    },
    "num_workers": 1,
    "framework": None,
    "log_level": "DEBUG",
    "create_env_on_driver": True,
    "batch_mode": "complete_episodes",
    "rollout_fragment_length": 50,
    "train_batch_size": 50,
    "evaluation_num_episodes": 0,
}

my_trainer = MyTrainer(config=config)
```
produces this error, which I can neither debug nor explain:
```
2021-07-16 13:10:10,350 INFO services.py:1272 -- View the Ray dashboard at http://127.0.0.1:8265
2021-07-16 13:10:11,977 INFO trainer.py:671 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
Env created!!
2021-07-16 13:10:12,085 DEBUG rollout_worker.py:1160 -- Creating policy for default_policy
2021-07-16 13:10:12,086 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,086 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,086 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,086 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,086 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,086 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0, 0, (1,), int16)
2021-07-16 13:10:12,086 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0, 0, (1,), int16)
2021-07-16 13:10:12,086 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0, 0, (1,), int16)
2021-07-16 13:10:12,087 DEBUG catalog.py:709 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x7f2f34070ee0>: Dict(price_close:Box(0.0, inf, (1,), float32), price_high:Box(0.0, inf, (1,), float32), price_low:Box(0.0, inf, (1,), float32), price_open:Box(0.0, inf, (1,), float32), tied_up_margin:Box(0.0, inf, (1,), float32), trade_count:Box(0, 0, (1,), int16), trade_count_base:Box(0, 0, (1,), int16), trade_count_quote:Box(0, 0, (1,), int16)) -> (8,)
Policy created
2021-07-16 13:10:12,087 DEBUG rollout_worker.py:698 -- Created rollout worker with env <ray.rllib.env.base_env._VectorEnvToBaseEnv object at 0x7f2f14776a30> (<MyEnv instance>), policies {'default_policy': <policies.DeterministicPolicy object at 0x7f2f14776280>}
Env created!!
2021-07-16 13:10:12,090 DEBUG rollout_worker.py:1160 -- Creating policy for default_policy
2021-07-16 13:10:12,091 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,091 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,091 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,091 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,091 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0.0, inf, (1,), float32)
2021-07-16 13:10:12,092 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0, 0, (1,), int16)
2021-07-16 13:10:12,092 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0, 0, (1,), int16)
2021-07-16 13:10:12,092 DEBUG preprocessors.py:249 -- Creating sub-preprocessor for Box(0, 0, (1,), int16)
2021-07-16 13:10:12,092 DEBUG catalog.py:709 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x7f2f14782970>: Dict(price_close:Box(0.0, inf, (1,), float32), price_high:Box(0.0, inf, (1,), float32), price_low:Box(0.0, inf, (1,), float32), price_open:Box(0.0, inf, (1,), float32), tied_up_margin:Box(0.0, inf, (1,), float32), trade_count:Box(0, 0, (1,), int16), trade_count_base:Box(0, 0, (1,), int16), trade_count_quote:Box(0, 0, (1,), int16)) -> (8,)
Policy created
2021-07-16 13:10:12,092 INFO rollout_worker.py:1199 -- Built policy map: {'default_policy': <policies.DeterministicPolicy object at 0x7f2f14782d30>}
2021-07-16 13:10:12,092 INFO rollout_worker.py:1200 -- Built preprocessor map: {'default_policy': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x7f2f14782970>}
2021-07-16 13:10:12,092 INFO rollout_worker.py:583 -- Built filter map: {'default_policy': <ray.rllib.utils.filter.NoFilter object at 0x7f2f14782d00>}
2021-07-16 13:10:12,093 DEBUG rollout_worker.py:698 -- Created rollout worker with env <ray.rllib.env.base_env._VectorEnvToBaseEnv object at 0x7f2f1478e610> (<ForexEnv instance>), policies {'default_policy': <policies.DeterministicPolicy object at 0x7f2f14782d30>}
2021-07-16 13:10:12,094 WARNING util.py:53 -- Install gputil for GPU system monitoring.
Exception ignored in: <function ActorHandle.__del__ at 0x7f2f8a52c8b0>
Traceback (most recent call last):
  File "/home/simon/git-projects/test-rllib/.venv/lib/python3.9/site-packages/ray/actor.py", line 823, in __del__
AttributeError: 'NoneType' object has no attribute 'global_worker'
```
I have tried a lot of things, like stepping manually through the environment via `MyEnv.step()` and `DeterministicPolicy.compute_actions()` (actually `DeterministicPolicy.compute_single_action()`, since I have no `VectorEnv`). That works so far, but I could not get anywhere near understanding why calling the constructor of `MyTrainer` breaks.
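For reference, my manual stepping loop looks roughly like the sketch below. The `DummyEnv` and `DummyPolicy` classes are stand-ins I wrote just for this post (my real `env.MyEnv` and `policies.DeterministicPolicy` are larger), but the observation dict mirrors the spaces from the log above, and the loop structure is the same:

```python
import numpy as np

class DummyEnv:
    """Stand-in for MyEnv: emits the same Dict observation (5 float32 + 3 int16 fields)."""
    def __init__(self):
        self.t = 0

    def _obs(self):
        # Mirrors the Dict space shown in the preprocessor log (flattened to 8 values).
        return {
            "price_close": np.array([1.0], dtype=np.float32),
            "price_high": np.array([1.1], dtype=np.float32),
            "price_low": np.array([0.9], dtype=np.float32),
            "price_open": np.array([1.0], dtype=np.float32),
            "tied_up_margin": np.array([0.0], dtype=np.float32),
            "trade_count": np.array([0], dtype=np.int16),
            "trade_count_base": np.array([0], dtype=np.int16),
            "trade_count_quote": np.array([0], dtype=np.int16),
        }

    def reset(self):
        self.t = 0
        return self._obs()

    def step(self, action):
        self.t += 1
        done = self.t >= 5  # short episode for illustration
        return self._obs(), 0.0, done, {}

class DummyPolicy:
    """Stand-in for DeterministicPolicy: a fixed action per observation."""
    def compute_single_action(self, obs):
        return 0

# Manual rollout, exactly as I test the real classes:
env_, policy = DummyEnv(), DummyPolicy()
obs, done, steps = env_.reset(), False, 0
while not done:
    action = policy.compute_single_action(obs)
    obs, reward, done, info = env_.step(action)
    steps += 1
print(steps)  # 5
```

Run with the real classes instead of the dummies, this loop completes episodes without any error, which is why I suspect the problem lies in the trainer construction rather than in the env or policy themselves.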
Any help or explanation is welcome.