How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I’m training a SAC agent using tune.run(), but I ran into the following problem:
Failure # 1 (occurred at 2023-04-19_03-26-42)
ray::SAC.train() (pid=1102303, ip=10.122.184.202, repr=SAC)
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 368, in train
raise skipped from exception_cause(skipped)
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 365, in train
result = self.step()
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 782, in step
results, train_iter_ctx = self._run_one_training_iteration()
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 2713, in _run_one_training_iteration
results = self.training_step()
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/rllib/algorithms/dqn/dqn.py", line 414, in training_step
self.local_replay_buffer.add(new_sample_batch)
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/rllib/utils/replay_buffers/multi_agent_replay_buffer.py", line 229, in add
self._add_to_underlying_buffer(policy_id, sample_batch, **kwargs)
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/rllib/utils/replay_buffers/multi_agent_prioritized_replay_buffer.py", line 234, in _add_to_underlying_buffer
self.replay_buffers[policy_id].add(slice, **kwargs)
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/rllib/utils/replay_buffers/replay_buffer.py", line 206, in add
warn_replay_capacity(item=batch, num_items=self.capacity / batch.count)
File "/home/wulong/anaconda3/envs/drl/lib/python3.10/site-packages/ray/rllib/utils/replay_buffers/replay_buffer.py", line 59, in warn_replay_capacity
raise ValueError(msg)
ValueError: Estimated max memory usage for replay buffer is 160.074 GB (1000000.0 batches of size 1, 160074 bytes each), available system memory is 67.225485312 GB
Why is the estimated max memory usage for the replay buffer so large? And what should I do to cope with it?
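
For reference, the arithmetic behind the error message can be reproduced with a quick sketch. All three figures are copied from the ValueError above. The `replay_buffer_config` / `capacity` override at the end is an assumption on my part about where the 1,000,000 comes from (I have not set a capacity myself), and the value 100_000 is only an illustrative placeholder.

```python
# Figures copied from the ValueError message above.
num_items = 1_000_000               # "1000000.0 batches of size 1"
bytes_per_item = 160_074            # "160074 bytes each"
available_bytes = 67_225_485_312    # "available system memory is 67.225485312 GB"

# The estimate is simply capacity * size of one stored timestep.
estimated_bytes = num_items * bytes_per_item
estimated_gb = estimated_bytes / 1e9  # the message uses 1 GB = 1e9 bytes

# Largest capacity whose estimate would still fit in the reported memory.
max_items_that_fit = available_bytes // bytes_per_item

print(f"estimated: {estimated_gb:.3f} GB")
print(f"largest capacity that fits: {max_items_that_fit}")

# Assumption: the capacity is taken from the "replay_buffer_config" entry of
# the algorithm config, so a smaller value could be passed via tune.run(config=...).
# 100_000 is a placeholder, not a recommendation.
smaller_buffer_config = {"replay_buffer_config": {"capacity": 100_000}}
```

So roughly 160 KB per stored timestep times one million timesteps gives the 160 GB estimate. The "largest capacity that fits" figure is only an upper bound, since it leaves no memory for anything else on the machine.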