Sure, but since you do not have access to my simulator you might not be able to reproduce the error!
For data collection:

import ray.tune as tune
from ray.rllib.algorithms.pg.pg import PGConfig

config = PGConfig().to_dict()
config["output"] = "/tmp/cartpole-out"
config["output_max_file_size"] = 5000000
config["env"] = "CartPole-v0"  # <- everything works smoothly when I use this, but not with my own gym env

tune.run(
    "PG",
    stop={"timesteps_total": 4000},
    config=config,
)
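In case it helps, my custom env is plugged in the usual way via register_env, roughly like this (MySimEnv and "my_sim_env" are placeholder names here, the real class wraps my simulator):

import numpy as np
import gym
from gym import spaces
from ray.tune.registry import register_env

class MySimEnv(gym.Env):
    """Minimal stand-in for my simulator-backed env (placeholder only)."""

    def __init__(self, env_config=None):
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        # Dummy transition: random observation, zero reward, never done.
        return self.observation_space.sample(), 0.0, False, {}

register_env("my_sim_env", lambda env_config: MySimEnv(env_config))
# The configs above then just use the registered name:
# config["env"] = "my_sim_env"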
Offline RL:

import ray.tune as tune
from ray.rllib.algorithms.dqn.dqn import DQNConfig

config = DQNConfig().to_dict()
config["input"] = "/tmp/cartpole-out"
config["explore"] = False
config["env"] = "CartPole-v0"

tune.run(
    "DQN",  # <- my custom gym env works if I use the same algorithm in collection and in offline training
    config=config,
)
I could play around with some options for dummy environments, but even if I worked that out I would not be able to do any online evaluation of my offline learning…
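To be concrete, the dummy-environment route I have in mind is something along these lines; the observation_space / action_space and evaluation keys are my guess from the offline RL docs (not verified against my Ray version), and the spaces here are placeholders:

import numpy as np
from gym import spaces
from ray.rllib.algorithms.dqn.dqn import DQNConfig

# Placeholder spaces; in reality these would be copied from my simulator env.
dummy_obs_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
dummy_action_space = spaces.Discrete(2)

config = DQNConfig().to_dict()
config["input"] = "/tmp/cartpole-out"
config["explore"] = False
config["env"] = None  # no live env during offline training
config["observation_space"] = dummy_obs_space
config["action_space"] = dummy_action_space

# Online evaluation would need sampling from a real env again, e.g.:
config["evaluation_interval"] = 1
config["evaluation_num_workers"] = 1
config["evaluation_config"] = {"input": "sampler"}
# ...which brings me right back to needing the actual simulator env to work.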
The error looks slightly different when I switch to torch as the framework; maybe that can give a clue:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x2 and 96x256)