RLlib on multiple GPUs with framework tf2

Hi @arturn,

Thanks for your reply.

I just retested a slightly modified version of your own code example for the IMPALA algo (see below).

Ray version: 2.3.1
Tensorflow version: 2.10.0

Both GPUs in my test setup are visible to TensorFlow, as verified by running:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Outputs:

2023-04-20 21:37:56.551515: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-20 21:37:56.646045: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-04-20 21:37:56.988490: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/novelty/miniconda3/envs/ray231/lib/
2023-04-20 21:37:56.988535: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/novelty/miniconda3/envs/ray231/lib/
2023-04-20 21:37:56.988541: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2023-04-20 21:37:57.505636: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.505833: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.508862: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.509017: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.509156: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.509291: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
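
For completeness, the GPUs can also be checked from inside a Ray worker process, since that is where RLlib actually uses them. A minimal, hypothetical check (assuming a local ray.init() and that TensorFlow is importable in the worker):

import ray

ray.init()

@ray.remote(num_gpus=1)
def visible_gpus():
    # Ray sets CUDA_VISIBLE_DEVICES for this task, so TensorFlow
    # should only see the single GPU assigned to it.
    import tensorflow as tf
    return [d.name for d in tf.config.list_physical_devices("GPU")]

print(ray.get(visible_gpus.remote()))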

Modified code example:

from ray.rllib.algorithms.impala import ImpalaConfig

config = ImpalaConfig()
config = config.training(lr=0.0003, train_batch_size=512)
config = config.framework(framework="tf2", eager_tracing=True)
config = config.resources(num_gpus=2)
config = config.rollouts(num_rollout_workers=16)
print(config.to_dict())
# Build an Algorithm object from the config and run one training iteration.
algo = config.build(env="CartPole-v1")
algo.train()

And I get this error:


Traceback (most recent call last):
  File "/home/novelty/ray_231/ray_231_multi_gpu.py", line 22, in <module>
    algo = config.build(env="CartPole-v1")
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 926, in build
    return algo_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 371, in __init__
    config.validate()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala.py", line 322, in validate
    super().validate()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 795, in validate
    raise ValueError(
ValueError: num_gpus > 1 not supported yet for framework=tf2!
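
So num_gpus > 1 is rejected outright for framework="tf2", and the only multi-GPU path left in this version appears to be the static-graph "tf" stack. For clarity, commenting out the framework line amounts to the following sketch (identical settings otherwise, with the framework falling back to its default "tf"):

from ray.rllib.algorithms.impala import ImpalaConfig

config = ImpalaConfig()
config = config.training(lr=0.0003, train_batch_size=512)
# framework(...) line removed: the config falls back to the default
# static-graph "tf" framework, which does allow num_gpus > 1.
config = config.resources(num_gpus=2)
config = config.rollouts(num_rollout_workers=16)
algo = config.build(env="CartPole-v1")
algo.train()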

Running with the framework line commented out (i.e., the sketch above), I get this:


(RolloutWorker pid=43640) 2023-04-20 21:25:33,220 WARNING env.py:166 -- Your env reset() method appears to take 'seed' or 'return_info' arguments. Note that these are not yet supported in RLlib. Seeding will take place using 'env.seed()' and the info dict will not be returned from reset.
Exception in thread Thread-7:
Traceback (most recent call last):
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1378, in _do_call
    return fn(*args)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1361, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1454, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 3 root error(s) found.
(0) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
[[default_policy/tower_1/Cast_6/ReadVariableOp/_60]]
(1) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
[[default_policy/tower_0/Maximum/_199]]
(2) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/execution/learner_thread.py", line 74, in run
    self.step()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/execution/multi_gpu_learner_thread.py", line 162, in step
    default_policy_results = policy.learn_on_loaded_batch(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 1037, in learn_on_loaded_batch
    results = tower_stack.optimize(self.get_session(), offset)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1261, in optimize
    return sess.run(fetches, feed_dict=feed_dict)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 968, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1191, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1371, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1397, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'default_policy/tower_1/Reshape_5' defined at (most recent call last):
  File "/home/novelty/ray_231/ray_231_multi_gpu.py", line 22, in <module>
    algo = config.build(env="CartPole-v1")
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 926, in build
    return algo_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 445, in __init__
    super().__init__(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/tune/trainable/trainable.py", line 169, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala.py", line 470, in setup
    super().setup(config)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 571, in setup
    self.workers = WorkerSet(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 170, in __init__
    self._setup(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 260, in _setup
    self._local_worker = self._make_worker(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 946, in _make_worker
    worker = cls(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 737, in __init__
    self._build_policy_map(policy_dict=self.policy_dict)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1984, in _build_policy_map
    new_policy = create_policy_for_framework(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 130, in create_policy_for_framework
    return policy_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 313, in __init__
    self.maybe_initialize_optimizer_and_loss()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 678, in maybe_initialize_optimizer_and_loss
    self.multi_gpu_tower_stacks = [
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 679, in <listcomp>
    TFMultiGPUTowerStack(policy=self)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1024, in __init__
    self._setup_device(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1296, in _setup_device
    graph_obj = self.policy_copy(device_input_slices)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 943, in copy
    losses = instance._do_loss_init(SampleBatch(input_dict))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 873, in _do_loss_init
    losses = self.loss(self.model, self.dist_class, train_batch)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 374, in loss
    behaviour_action_logp=make_time_major(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 336, in make_time_major
    return _make_time_major(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 162, in _make_time_major
    rs = tf.reshape(tensor, tf.concat([[B, T], tf.shape(tensor)[1:]], axis=0))
Node: 'default_policy/tower_1/Reshape_5'
3 root error(s) found.
(0) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
[[default_policy/tower_1/Cast_6/ReadVariableOp/_60]]
(1) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
[[default_policy/tower_0/Maximum/_199]]
(2) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'default_policy/tower_1/Reshape_5':
  File "/home/novelty/ray_231/ray_231_multi_gpu.py", line 22, in <module>
    algo = config.build(env="CartPole-v1")
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 926, in build
    return algo_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 445, in __init__
    super().__init__(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/tune/trainable/trainable.py", line 169, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala.py", line 470, in setup
    super().setup(config)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 571, in setup
    self.workers = WorkerSet(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 170, in __init__
    self._setup(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 260, in _setup
    self._local_worker = self._make_worker(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 946, in _make_worker
    worker = cls(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 737, in __init__
    self._build_policy_map(policy_dict=self.policy_dict)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1984, in _build_policy_map
    new_policy = create_policy_for_framework(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 130, in create_policy_for_framework
    return policy_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 313, in __init__
    self.maybe_initialize_optimizer_and_loss()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 678, in maybe_initialize_optimizer_and_loss
    self.multi_gpu_tower_stacks = [
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 679, in <listcomp>
    TFMultiGPUTowerStack(policy=self)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1024, in __init__
    self._setup_device(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1296, in _setup_device
    graph_obj = self.policy_copy(device_input_slices)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 943, in copy
    losses = instance._do_loss_init(SampleBatch(input_dict))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 873, in _do_loss_init
    losses = self.loss(self.model, self.dist_class, train_batch)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 374, in loss
    behaviour_action_logp=make_time_major(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 336, in make_time_major
    return _make_time_major(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 162, in _make_time_major
    rs = tf.reshape(tensor, tf.concat([[B, T], tf.shape(tensor)[1:]], axis=0))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/ops/array_ops.py", line 199, in reshape
    result = gen_array_ops.reshape(tensor, shape, name)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8551, in reshape
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/framework/op_def_library.py", line 797, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 3800, in _create_op_internal
    ret = Operation(

After this, execution hangs and has to be forcibly stopped.
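
In case it helps with triage: the 256-vs-250 mismatch looks to me like a batch-splitting problem. 512 samples split across 2 GPU towers gives 256 per tower, while the time-major reshape in _make_time_major wants B * T = 5 * 50 = 250 with IMPALA's default rollout_fragment_length of 50. If that guess is right, a train_batch_size divisible by num_gpus * rollout_fragment_length should avoid the leftover samples, e.g. (untested):

# Untested workaround sketch: make the per-tower batch divide evenly into
# time-major chunks: 500 / 2 GPUs = 250 = 5 * 50 (default fragment length).
config = config.training(lr=0.0003, train_batch_size=500)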

Any advice on the above?

I would also like to bring issue #9863 to your attention, which also fails to run when framework="tf2" is specified.

BR

Jorgen