RLlib on multiple GPUs with framework tf2

  • Low: It annoys or frustrates me for a moment.

According to the algorithm documentation, algorithms like PPO, IMPALA, etc. should also be able to run on multiple GPUs when using TensorFlow.

However, the documentation for AlgorithmConfig.framework says "tf_session_args – Configures TF for single-process operation by default." I assume this needs to be changed to get TensorFlow to run on multiple GPUs?

If so, what would that look like when running on, say, 2 GPUs on a single machine?
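
My rough guess is that such an override would go through the tf_session_args argument of AlgorithmConfig.framework(), something like the sketch below, but I have not verified that any of these TF ConfigProto fields actually matter for multi-GPU:

from ray.rllib.algorithms.impala import ImpalaConfig

# Rough sketch only: pass TF session options (ConfigProto fields) through
# AlgorithmConfig.framework(). The values shown are common defaults, not a
# recommendation for multi-GPU.
config = ImpalaConfig().framework(
    framework="tf",
    tf_session_args={
        "allow_soft_placement": True,           # fall back to CPU for ops without a GPU kernel
        "log_device_placement": False,
        "gpu_options": {"allow_growth": True},  # don't grab all GPU memory up front
        "device_count": {"CPU": 1},
    },
)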

BR

Jorgen

No, we run many of our tests with tf2 on multiple GPUs.
A single process can control multiple GPUs.
It moves batches to different GPUs and spawns separate threads to compute gradients on them.
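
Conceptually it is the classic tower pattern. A simplified eager-mode sketch of the idea looks like the following; this is not the actual RLlib implementation (which builds a static-graph TFMultiGPUTowerStack and drives it from a learner thread), and model, loss_fn, and batch_slices are just placeholders:

import tensorflow as tf

def tower_gradients(model, loss_fn, batch_slices, gpu_ids=(0, 1)):
    # One "tower" per GPU: place the loss computation for one batch slice
    # on each device, then average the per-GPU gradients.
    tower_grads = []
    for gpu_id, batch in zip(gpu_ids, batch_slices):
        with tf.device(f"/GPU:{gpu_id}"):
            with tf.GradientTape() as tape:
                loss = loss_fn(model, batch)
            tower_grads.append(tape.gradient(loss, model.trainable_variables))
    # Average each variable's gradient across the towers.
    return [tf.reduce_mean(tf.stack(grads), axis=0) for grads in zip(*tower_grads)]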

Hi @arturn .

Thanks for your reply.

I just retested a slightly modified version of your own code example for the IMPALA algorithm (see below).

Ray version: 2.3.1
Tensorflow version: 2.10.0

Both GPUs in my test setup are visible to TensorFlow, as verified by running:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Outputs:

2023-04-20 21:37:56.551515: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-20 21:37:56.646045: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-04-20 21:37:56.988490: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/novelty/miniconda3/envs/ray231/lib/
2023-04-20 21:37:56.988535: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/novelty/miniconda3/envs/ray231/lib/
2023-04-20 21:37:56.988541: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2023-04-20 21:37:57.505636: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.505833: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.508862: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.509017: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.509156: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-20 21:37:57.509291: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]

Modified code example:

from ray.rllib.algorithms.impala import ImpalaConfig

config = ImpalaConfig()
config = config.training(lr=0.0003, train_batch_size=512)
config = config.framework(framework="tf2", eager_tracing=True)
config = config.resources(num_gpus=2)
config = config.rollouts(num_rollout_workers=16)
print(config.to_dict())
# Build an Algorithm object from the config and run 1 training iteration.
algo = config.build(env="CartPole-v1")
algo.train()

And I get this error:


Traceback (most recent call last):
  File "/home/novelty/ray_231/ray_231_multi_gpu.py", line 22, in <module>
    algo = config.build(env="CartPole-v1")
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 926, in build
    return algo_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 371, in __init__
    config.validate()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala.py", line 322, in validate
    super().validate()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 795, in validate
    raise ValueError(
ValueError: num_gpus > 1 not supported yet for framework=tf2!

Now, when I comment out the framework line, I get this:


(RolloutWorker pid=43640) 2023-04-20 21:25:33,220 WARNING env.py:166 -- Your env reset() method appears to take 'seed' or 'return_info' arguments. Note that these are not yet supported in RLlib. Seeding will take place using 'env.seed()' and the info dict will not be returned from reset.
Exception in thread Thread-7:
Traceback (most recent call last):
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1378, in _do_call
    return fn(*args)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1361, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1454, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 3 root error(s) found.
(0) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
[[default_policy/tower_1/Cast_6/ReadVariableOp/_60]]
(1) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
[[default_policy/tower_0/Maximum/_199]]
(2) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/execution/learner_thread.py", line 74, in run
    self.step()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/execution/multi_gpu_learner_thread.py", line 162, in step
    default_policy_results = policy.learn_on_loaded_batch(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 1037, in learn_on_loaded_batch
    results = tower_stack.optimize(self.get_session(), offset)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1261, in optimize
    return sess.run(fetches, feed_dict=feed_dict)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 968, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1191, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1371, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1397, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'default_policy/tower_1/Reshape_5' defined at (most recent call last):
  File "/home/novelty/ray_231/ray_231_multi_gpu.py", line 22, in <module>
    algo = config.build(env="CartPole-v1")
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 926, in build
    return algo_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 445, in __init__
    super().__init__(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/tune/trainable/trainable.py", line 169, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala.py", line 470, in setup
    super().setup(config)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 571, in setup
    self.workers = WorkerSet(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 170, in __init__
    self._setup(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 260, in _setup
    self._local_worker = self._make_worker(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 946, in _make_worker
    worker = cls(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 737, in __init__
    self._build_policy_map(policy_dict=self.policy_dict)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1984, in _build_policy_map
    new_policy = create_policy_for_framework(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 130, in create_policy_for_framework
    return policy_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 313, in __init__
    self.maybe_initialize_optimizer_and_loss()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 678, in maybe_initialize_optimizer_and_loss
    self.multi_gpu_tower_stacks = [
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 679, in <listcomp>
    TFMultiGPUTowerStack(policy=self)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1024, in __init__
    self._setup_device(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1296, in _setup_device
    graph_obj = self.policy_copy(device_input_slices)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 943, in copy
    losses = instance._do_loss_init(SampleBatch(input_dict))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 873, in _do_loss_init
    losses = self.loss(self.model, self.dist_class, train_batch)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 374, in loss
    behaviour_action_logp=make_time_major(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 336, in make_time_major
    return _make_time_major(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 162, in _make_time_major
    rs = tf.reshape(tensor, tf.concat([[B, T], tf.shape(tensor)[1:]], axis=0))
Node: 'default_policy/tower_1/Reshape_5'
[... the identical "Detected at node 'default_policy/tower_1/Reshape_5'" traceback repeats twice more ...]
3 root error(s) found.
(0) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
[[default_policy/tower_1/Cast_6/ReadVariableOp/_60]]
(1) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
[[default_policy/tower_0/Maximum/_199]]
(2) INVALID_ARGUMENT: Input to reshape is a tensor with 256 values, but the requested shape has 250
[[{{node default_policy/tower_1/Reshape_5}}]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'default_policy/tower_1/Reshape_5':
  File "/home/novelty/ray_231/ray_231_multi_gpu.py", line 22, in <module>
    algo = config.build(env="CartPole-v1")
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 926, in build
    return algo_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 445, in __init__
    super().__init__(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/tune/trainable/trainable.py", line 169, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala.py", line 470, in setup
    super().setup(config)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 571, in setup
    self.workers = WorkerSet(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 170, in __init__
    self._setup(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 260, in _setup
    self._local_worker = self._make_worker(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 946, in _make_worker
    worker = cls(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 737, in __init__
    self._build_policy_map(policy_dict=self.policy_dict)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1984, in _build_policy_map
    new_policy = create_policy_for_framework(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 130, in create_policy_for_framework
    return policy_class(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 313, in __init__
    self.maybe_initialize_optimizer_and_loss()
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 678, in maybe_initialize_optimizer_and_loss
    self.multi_gpu_tower_stacks = [
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 679, in <listcomp>
    TFMultiGPUTowerStack(policy=self)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1024, in __init__
    self._setup_device(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 1296, in _setup_device
    graph_obj = self.policy_copy(device_input_slices)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 943, in copy
    losses = instance._do_loss_init(SampleBatch(input_dict))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy_v2.py", line 873, in _do_loss_init
    losses = self.loss(self.model, self.dist_class, train_batch)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 374, in loss
    behaviour_action_logp=make_time_major(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 336, in make_time_major
    return _make_time_major(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/ray/rllib/algorithms/impala/impala_tf_policy.py", line 162, in _make_time_major
    rs = tf.reshape(tensor, tf.concat([[B, T], tf.shape(tensor)[1:]], axis=0))
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/ops/array_ops.py", line 199, in reshape
    result = gen_array_ops.reshape(tensor, shape, name)
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8551, in reshape
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/framework/op_def_library.py", line 797, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/home/novelty/miniconda3/envs/ray231/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 3800, in _create_op_internal
    ret = Operation(

And the execution hangs and has to be forcibly stopped.

Any advice on the above?

I would also like to bring issue #9863 to your attention, which also will not run when specifying framework="tf2".

BR

Jorgen

Hi @Jorgen_Svane,

My bad!
Only torch and tf1 are supported for multi-GPU training right now.
But we are working on a new API (RLModules/Learners) that will support this for tf2.
You can already see traces of it on the master branch, and we are approaching tf2 IMPALA and APPO implementations.

Sorry for the confusion!