Slow PPO training: high learn_time_ms

I’m trying to use PPO to train on a simple multi-agent environment, but the time per iteration seems too slow. My learn_time_ms metric is usually around 4000 ms, even though I am only training a small fully connected network with two hidden layers of 64 units ([64, 64]). Training runs on a CPU cluster. Is this time per iteration normal?
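
To put the 4000 ms in context, it may help to divide it by the number of gradient updates performed in one iteration. Below is a rough back-of-envelope sketch. Only the learn_time_ms value comes from my run; the batch size, minibatch size, and number of epochs are hypothetical placeholders, and the helper name `per_update_ms` is made up for illustration.

```python
import math


def per_update_ms(learn_time_ms, train_batch_size, minibatch_size, num_epochs):
    """Return (number of gradient updates, milliseconds per update)."""
    # Each epoch sweeps the whole training batch in minibatches.
    minibatches_per_epoch = math.ceil(train_batch_size / minibatch_size)
    num_updates = minibatches_per_epoch * num_epochs
    return num_updates, learn_time_ms / num_updates


# learn_time_ms = 4000 is the observed value; the other three arguments
# are assumed placeholders, not the actual settings of the run.
updates, ms_each = per_update_ms(
    learn_time_ms=4000,
    train_batch_size=4000,
    minibatch_size=128,
    num_epochs=30,
)
print(f"{updates} updates, {ms_each:.2f} ms per update")
```

If the real settings give a per-update time of a few milliseconds, the total may simply reflect many small updates on CPU rather than a slow network. With other settings the picture could look quite different.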