Tuning Hyperparameters of RLlib PPO on SageMaker

I have built a Docker container for RLlib's PPO algorithm, and I am using Amazon SageMaker to tune its hyperparameters. SageMaker requires metric_definitions for the metric to optimize. What should I use in the metric_definitions for SageMaker? I am not using SageMaker's RL estimator. Thanks in advance for your response.

Hey @Arif_Jahangir, thanks for the question!

Can you provide more info on what your setup is? How does the RLlib model hook into SageMaker HPO? From my cursory understanding, the metric_definition can be set to whatever value you want to optimize, which would usually be episode_reward_mean.

Also generally RLlib plays really well with Ray Tune for HPO. Would you be able to use Tune instead of Sagemaker HPO?

Hey @amogkam thanks for your response.
I have built a Docker container with the following Dockerfile:

FROM tensorflow/tensorflow:2.3.0-gpu
RUN apt-get update && apt-get install -y --no-install-recommends nginx curl

RUN pip install sagemaker-containers
RUN pip install ray[default]
RUN pip install ray[rllib]
RUN pip install torch
RUN pip install jupyterlab
RUN pip install gym
RUN pip install pandas
RUN pip install numpy
RUN pip install matplotlib
RUN pip install pickle-mixin
RUN pip install datetime
RUN pip install temp

# Copies the training code inside the container

# Defines train.py as script entry point

ENV PATH="/opt/ml/code:${PATH}"

COPY /ppo_amd /opt/ml/code
WORKDIR /opt/ml/code

This Docker container hooks up with SageMaker and runs the program.
When I try to use the SageMaker hyperparameter tuner, it gives me the following error:

ClientError Traceback (most recent call last)
----> 1 tuner.fit()

/usr/local/lib/python3.6/site-packages/sagemaker/tuner.py in fit(self, inputs, job_name, include_cls_metadata, estimator_kwargs, wait, **kwargs)
442 """
443 if self.estimator is not None:
--> 444 self._fit_with_estimator(inputs, job_name, include_cls_metadata, **kwargs)
445 else:
446 self._fit_with_estimator_dict(inputs, job_name, include_cls_metadata, estimator_kwargs)

/usr/local/lib/python3.6/site-packages/sagemaker/tuner.py in _fit_with_estimator(self, inputs, job_name, include_cls_metadata, **kwargs)
453 self._prepare_estimator_for_tuning(self.estimator, inputs, job_name, **kwargs)
454 self._prepare_for_tuning(job_name=job_name, include_cls_metadata=include_cls_metadata)
--> 455 self.latest_tuning_job = _TuningJob.start_new(self, inputs)
457 def _fit_with_estimator_dict(self, inputs, job_name, include_cls_metadata, estimator_kwargs):

/usr/local/lib/python3.6/site-packages/sagemaker/tuner.py in start_new(cls, tuner, inputs)
1507 ]
--> 1509 tuner.sagemaker_session.create_tuning_job(**tuner_args)
1510 return cls(tuner.sagemaker_session, tuner._current_job_name)

/usr/local/lib/python3.6/site-packages/sagemaker/session.py in create_tuning_job(self, job_name, tuning_config, training_config, training_config_list, warm_start_config, tags)
2027 LOGGER.info("Creating hyperparameter tuning job with name: %s", job_name)
2028 LOGGER.debug("tune request: %s", json.dumps(tune_request, indent=4))
--> 2029 self.sagemaker_client.create_hyper_parameter_tuning_job(**tune_request)
2031 def describe_tuning_job(self, job_name):

/usr/local/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
355 "%s() only accepts keyword arguments." % py_operation_name)
356 # The "self" in this scope is referring to the BaseClient.
--> 357 return self._make_api_call(operation_name, kwargs)
359 _api_call.name = str(py_operation_name)

/usr/local/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
674 error_code = parsed_response.get(“Error”, {}).get(“Code”)
675 error_class = self.exceptions.from_code(error_code)
--> 676 raise error_class(parsed_response, operation_name)
677 else:
678 return parsed_response

ClientError: An error occurred (ValidationException) when calling the CreateHyperParameterTuningJob operation: A metric is required for this hyperparameter tuning job objective. Provide a metric in the metric definitions.

I have used metric_definitions as follows:

    "Name": "episode_reward_mean",
    "Regex": "episode_reward_max: ([-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?)",

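One thing that may be worth checking in the definition above: the metric is named episode_reward_mean, but the regex looks for the text episode_reward_max. A minimal sketch of testing a regex locally before handing it to SageMaker is shown here. The sample log lines are hypothetical (the real format depends on what train.py prints), and Python's `re` module is only assumed to behave like SageMaker's matcher for a simple pattern such as this one.

```python
import re

# The regex posted above, unchanged: note it looks for "episode_reward_max".
posted_regex = r"episode_reward_max: ([-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?)"
# A variant that targets the metric the definition is named after.
mean_regex = r"episode_reward_mean: ([-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?)"

# Hypothetical log lines; the real format depends on the training script.
sample_log = "episode_reward_max: 200.0\nepisode_reward_mean: 21.5\n"


def extract(pattern, text):
    """Return the first captured number as a float, or None if nothing matches."""
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None


posted_value = extract(posted_regex, sample_log)  # captures the max, not the mean
mean_value = extract(mean_regex, sample_log)
print(posted_value, mean_value)
```

If the two values differ on real training logs, the tuner would be optimizing the maximum episode reward under the name of the mean.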
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, CategoricalParameter, ContinuousParameter

hyperparameter_ranges = {
    "gamma": ContinuousParameter(0.30, 0.50),
    "lr": ContinuousParameter(0.0001, 0.0002),
}

objective_metric_name = "episode_reward_mean"
tuner = HyperparameterTuner(



This gives me the error shown above.
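The ValidationException reads as if the tuning request did not contain a metric definition whose name matches the objective. One possible cause, offered as a guess since the full tuner call is not shown, is that the definitions were never passed along with the job. The sketch below builds the definitions as a list of dicts and checks that the objective name is among them. `objective_is_defined` is a hypothetical helper, and the commented-out call assumes an existing `estimator` object wrapping the custom image; `metric_definitions` is a keyword argument of `HyperparameterTuner` in the SageMaker Python SDK.

```python
# Same shape as the "Name"/"Regex" entry posted above, as a full list of dicts.
metric_definitions = [
    {
        "Name": "episode_reward_mean",
        "Regex": r"episode_reward_mean: ([-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?)",
    }
]
objective_metric_name = "episode_reward_mean"


def objective_is_defined(objective, definitions):
    """Hypothetical helper: True if the objective name is among the definitions."""
    return any(d.get("Name") == objective for d in definitions)


# Fail early, before any request is sent to SageMaker.
if not objective_is_defined(objective_metric_name, metric_definitions):
    raise ValueError("objective metric has no matching metric definition")

# Assumed wiring, not executed here; `estimator` and `hyperparameter_ranges`
# are whatever the rest of the notebook defines:
# tuner = HyperparameterTuner(
#     estimator,
#     objective_metric_name,
#     hyperparameter_ranges,
#     metric_definitions=metric_definitions,
#     objective_type="Maximize",
# )
```

Running the check with an empty list of definitions raises the ValueError, which mirrors the situation the service-side error message describes.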

Do you mind formatting your response? It’s difficult to read as is. You can format the code via backticks.