What am I doing wrong? (PB2) - Reusing same parameters

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hi all, I'm new to the Ray library and have made a lot of progress, but I feel as if I am missing something fundamental in my implementation of PB2 for tuning the hyperparameters of an algorithmic trading strategy (using the Jesse library).

The issue I am facing is that the same time periods are evaluated on each training iteration, but the hyperparameters are only perturbed/exploited/explored for some of the trials. For the other trials, effectively the same test is repeated with the same set of hyperparameters.

Can anyone provide any insight into what I'm doing wrong here? Am I using the wrong tool for the job? Any advice greatly appreciated.

So far this resembles what I have implemented:

import ray
from ray.air import session
from ray.air.checkpoint import Checkpoint


def trainable(params):
    step = 1

    # Algorithm expects ints only so preprocess params into int
    params['p1']  = int(params['p1'])
    # ...  repeat for ~10 other hyperparameters

    # Checkpoint Loading...
    checkpoint = session.get_checkpoint()
    if checkpoint:
        state = checkpoint.to_dict()
        step = state['step'] + 1

    # Tune terminates function trainables once they return, so keep looping and
    # reporting; the scheduler/stopper decides when the trial actually stops.
    while True:
        sharpes = []
        profits = []

        # Call the external `backtest` function to evaluate the hyperparameters
        # over a range of different time periods (~5), then aggregate the results.
        for candles_ref in test_candles_refs:
            sharpe = -10    # in case the backtest fails to complete,
            profit = -1000  # initialise the results to low values
            result = backtest(ray.get(config_ref),   # defines the strategy to use
                              ray.get(routes_ref),   # defines the candle time period and currency pair
                              [],
                              ray.get(candles_ref),  # which candles to use for this sample
                              hyperparameters=params)  # the hyperparameters to use for the strategy
            try:
                sharpe = result['metrics']['smart_sharpe']
                profit = result['metrics']['net_profit']
            except KeyError:
                pass
            sharpes.append(sharpe)
            profits.append(profit)

        sum_sharpe = sum(sharpes)
        sum_profit = sum(profits)

        checkpoint_dict = {
            'step': step,
            'avg_smart_sharpe': sum_sharpe / len(sharpes),
            'sum_profit': sum_profit,
            'sharpes': sharpes,
            'profits': profits,
        }

        # Report the performance after it has been tested for each time period.
        checkpoint_dict['done'] = sum_sharpe / len(sharpes) > 10
        checkpoint = Checkpoint.from_dict(checkpoint_dict)
        session.report(checkpoint_dict, checkpoint=checkpoint)
        step += 1

Hi @l-j-g,

The trainable looks good. It would be more interesting to see how you instantiate PB2 and the Tuner. Can you share that part of the code?

Also, which behavior do you expect from PB2?

Generally, here is what happens in PB2 in a nutshell:

  • Say you’re running 8 trials. Then 8 random hyperparameter configurations are sampled.
  • The trials report their results to PB2.
  • Every perturbation_interval steps, PB2 checks which trials performed best and which performed worst.
  • The worst 25% of trials (so 2 trials here) are terminated; their slots instead exploit the two best trials.
  • “Exploit” means they copy the best trials and restart from their latest checkpoint. But it doesn’t make sense to just train the same parameters twice, so some of the parameters are perturbed as well.
  • In PB2’s case, Bayesian optimization is used to choose the perturbed values.
  • The other trials continue training with their own hyperparameters.
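The selection step above can be sketched in plain Python. This is a toy illustration only: `pb2_step`, the quantile handling, and the scores are made up, and the Bayesian-optimization perturbation of the copied config is elided entirely.

```python
def pb2_step(scores, quantile=0.25):
    """Toy version of one PB2 exploit step: the bottom `quantile` of
    trials (by reported score) each clone a top-quantile trial; the
    remaining trials continue unchanged. Real PB2 also perturbs the
    cloned hyperparameters via Bayesian optimization (omitted here)."""
    n = len(scores)
    k = max(1, int(n * quantile))
    ranked = sorted(range(n), key=lambda i: scores[i])  # worst first
    bottom, top = ranked[:k], ranked[-k:]
    # Map each bottom trial to the top trial it will exploit.
    return {worst: best for worst, best in zip(bottom, top)}

# 8 trials with these reported metrics:
scores = [0.1, 0.9, 0.4, 0.8, 0.2, 0.7, 0.3, 0.6]
print(pb2_step(scores))  # → {0: 3, 4: 1}: the 2 worst trials exploit the 2 best
```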

Does that help?
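For reference, a minimal PB2 + Tuner setup looks roughly like this. The metric name, bounds, `perturbation_interval`, and `p1` are illustrative only and would need adapting to your ~10 real hyperparameters; `trainable` is assumed to be the function from your post.

```python
from ray import tune
from ray.tune.schedulers.pb2 import PB2  # requires GPy and scikit-learn

pb2 = PB2(
    time_attr="training_iteration",
    metric="avg_smart_sharpe",      # must match a key reported by the trainable
    mode="max",
    perturbation_interval=5,        # exploit/explore every 5 reported steps
    quantile_fraction=0.25,         # bottom 25% of trials copy the top 25%
    hyperparam_bounds={
        "p1": [10, 200],            # PB2 needs continuous [min, max] bounds
    },
)

tuner = tune.Tuner(
    trainable,                      # the function trainable from the post above
    tune_config=tune.TuneConfig(scheduler=pb2, num_samples=8),
    param_space={"p1": tune.uniform(10, 200)},
)
results = tuner.fit()
```

Note that PB2 mutates only the keys listed in `hyperparam_bounds`; any parameter missing from it will never be perturbed, which can produce exactly the "same hyperparameters every iteration" symptom you describe.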