I am working on a project for algorithmic trading and Black-Litterman Portfolio optimization with reinforcement learning. Here I am using RLlib for the PPO algorithm and hyperparameter optimization using Ray tune.
Link to project: GitHub - Athe-kunal/Black-Litterman-Portfolio-Optimization-using-RL
On my university cluster, I have one V100 GPU and a 2-core Xeon CPU. Here are my configuration parameters:
num_workers = 1
num_samples = 20
num_gpus = 1
num_cpus = 2
training_iterations = 200
checkpoint_freq = 1
num_envs_per_worker = 100
worker_cpu = 0.5
worker_gpu = 0.5
log_level = "DEBUG"
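For context, here is a sketch of how I wire these parameters into Ray Tune (simplified from the repo; the environment name `"MyFinEnv"` is a placeholder for the actual registered environment):

```python
from ray import tune

# Sketch of the Tune/RLlib setup; "MyFinEnv" is a placeholder for the
# environment registered in the project.
config = {
    "env": "MyFinEnv",
    "num_workers": 1,
    "num_envs_per_worker": 100,
    "num_gpus": 1,               # GPU share for the trainer/driver
    "num_cpus_per_worker": 0.5,  # fractional CPU per rollout worker
    "num_gpus_per_worker": 0.5,  # fractional GPU per rollout worker
    "log_level": "DEBUG",
}

tune.run(
    "PPO",
    config=config,
    num_samples=20,                    # 20 hyperparameter trials
    stop={"training_iteration": 200},
    checkpoint_freq=1,
)
```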
It is a small financial environment with only 206 time steps. To run the code:
python main.py --if_confidence true --model mlp
The issue I am facing:
The Ray trials are not utilizing the hardware properly. I have only a 2-core CPU (though it is a Xeon, which likely exposes more logical threads via hyper-threading, so it could potentially support more workers). I am logging all my results to Weights and Biases here: Weights & Biases
In the sample_perf tab, you can see the resource utilization, which is a flat line. How can I ensure that I am using the hardware effectively? This is a server environment, so I am unable to access the Ray dashboard, which is why the Weights & Biases report is helpful. As I am still learning Ray and RLlib, can someone help me debug this and understand how I can use my resources effectively?
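In case it helps with debugging: since I cannot open the dashboard, one quick check I can run on the node is the sketch below. It uses only the standard library to see how many logical CPUs the process gets (the Ray-specific calls are shown as comments because they need a running Ray instance):

```python
import os

# Logical CPU count seen by this process. Xeon cores are usually
# hyper-threaded, so this can be twice the physical core count.
# Ray registers this logical count by default when ray.init() is
# called without an explicit num_cpus.
logical_cpus = os.cpu_count()
print(f"Logical CPUs visible: {logical_cpus}")

# With Ray running, the registered resources can be printed directly:
#   import ray
#   ray.init()
#   print(ray.cluster_resources())    # total CPUs/GPUs Ray knows about
#   print(ray.available_resources())  # what is currently free
```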