Does ChatGPT suggest a correct config for 1 GPU and 72 CPUs?

OS: Ubuntu 20.04.6 LTS
Server: Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz, 72 logical CPUs
GPU: Tesla K80 (an old one, released in 2014)
Python: 3.11.8
Ray: 2.37.0

What config settings should I use for the fastest PPO training on a dataset of ~600k rows and 300 columns? Right now training takes many hours or even days, and a single episode can take an hour.
I have tried many variants, including GPU-only and CPU-only setups, but haven't seen a big difference.
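
For reference, here is a minimal check I use to confirm that Ray actually sees all 72 CPUs and the GPU (standard Ray calls, nothing specific to my setup):

import ray

# Start Ray locally; it auto-detects the CPUs and GPUs on the node.
ray.init()

print(ray.cluster_resources())    # expect something like {'CPU': 72.0, 'GPU': 1.0, ...}
print(ray.available_resources())  # what is left once actors are scheduled

ray.shutdown()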

My config:

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .framework("torch")
    .api_stack(
        # Disabled the new API stack because it raised an error on my setup.
        enable_rl_module_and_learner=False,
        enable_env_runner_and_connector_v2=False,
    )
    .environment(
        env=select_env,
        env_config={},
    )
    # .evaluation(
    #     evaluation_num_env_runners=1,
    #     evaluation_interval=1,
    #     # Run evaluation parallel to training to speed up the example.
    #     evaluation_parallel_to_training=False,
    # )
    .env_runners(
        num_env_runners=10,          # previously 2
        num_cpus_per_env_runner=1,
        num_envs_per_env_runner=10,  # previously 3
        num_gpus_per_env_runner=0,
    )
    .resources(
        num_gpus=1,
        # num_cpus_per_worker=1, 
        # num_gpus_per_worker=0,
    )
    .learners(
        # num_learners=1,
        # num_cpus_per_learner=2,
        # num_gpus_per_learner=0,
    )
    .training(
        # train_batch_size_per_learner=256,
        train_batch_size=5000,
    )
    .experimental(
        # _disable_preprocessor_api=False,
        # _enable_new_api_stack=True
    )
)
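
For completeness, this is roughly how I build and run the algorithm from the config above (a minimal sketch; the result key is the one I see on the old API stack):

import ray

ray.init()

algo = config.build()  # build PPO from the config above
for i in range(5):
    result = algo.train()
    print(i, result.get("episode_reward_mean"))

algo.stop()
ray.shutdown()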

Here is the config that ChatGPT advised for my case, but I got the error "TypeError: AlgorithmConfig.env_runners() got an unexpected keyword argument 'num_workers'".
Does anyone know what rollouts is here? There are absolutely no docs about it on ray.io. (My attempt at translating it to the current API is below, after the block.)

config = PPOConfig() \
    .environment(your_env) \
    .rollouts(
        num_workers=60,                # Adjust based on testing
        num_envs_per_worker=4,         # More parallel environments per worker
        rollout_fragment_length=400    # Balance throughput vs overhead
    ) \
    .training(
        train_batch_size=50000,        # Larger batch size for PPO
        sgd_minibatch_size=4000,       # Efficient use of GPU during SGD
        num_sgd_iter=10,               # Number of epochs per update
        lr=0.0003                      # Learning rate, can tune for your environment
    ) \
    .resources(
        num_gpus=1                     # Use the GPU
    ) \
    .evaluation(
        evaluation_num_workers=6       # Parallel evaluation workers
    )
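
As far as I can tell, .rollouts() is the old name of what is now .env_runners(), and the *_workers arguments were renamed to *_env_runners along the way; the error message itself suggests .rollouts() just forwards to .env_runners(). Here is my attempt at translating ChatGPT's suggestion to the Ray 2.37 names (a sketch, not verified to be the fastest settings; your_env is the same placeholder as above):

config = (
    PPOConfig()
    .environment(your_env)
    .env_runners(
        num_env_runners=60,             # was: rollouts(num_workers=60)
        num_envs_per_env_runner=4,      # was: num_envs_per_worker
        rollout_fragment_length=400,
    )
    .training(
        train_batch_size=50000,
        sgd_minibatch_size=4000,        # old-API-stack name
        num_sgd_iter=10,
        lr=0.0003,
    )
    .resources(num_gpus=1)
    .evaluation(evaluation_num_env_runners=6)  # was: evaluation_num_workers
)

With 60 sampling runners at 1 CPU each plus 6 evaluation runners, that leaves a handful of CPUs for the driver and learner, which seems to be the idea behind the 60/6 split on a 72-CPU box.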

Update: I still don't know how to get the multiprocessing part working. It's still super slow.
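
To at least see whether sampling is parallelized, I'm now timing env-step throughput per training iteration (a rough sketch; num_env_steps_sampled is the lifetime counter I see on the old API stack, so I take the per-iteration delta):

import time

algo = config.build()

prev_steps = 0
for i in range(3):
    start = time.time()
    result = algo.train()
    elapsed = time.time() - start
    total = result.get("num_env_steps_sampled", 0)  # lifetime counter
    print(f"iter {i}: {(total - prev_steps) / elapsed:.1f} env steps/s")
    prev_steps = total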