@SVH I think @mannyv already answered this in the reply to a similar question of yours.
The explore attribute for evaluation has to be set to True to get results comparable to training.
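In case it helps, here is a minimal sketch of what that could look like, assuming you are using RLlib's dict-style config. The `evaluation_interval` value is only a placeholder I picked for illustration, and the exact keys may differ in your version, so please check them against your setup.

```python
# Sketch only: assumes an RLlib-style dict config (not taken from the original post).
config = {
    # Exploration during training (stochastic action sampling).
    "explore": True,
    # Placeholder value: run evaluation after every training iteration.
    "evaluation_interval": 1,
    # Overrides applied to the evaluation workers only.
    # Setting explore to True here makes evaluation sample actions
    # the same way training does, so the rewards are comparable.
    "evaluation_config": {
        "explore": True,
    },
}
```

If you compute actions by hand instead, the same idea applies: pass the explore flag as True wherever you request an action for evaluation.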