I found a problem in my code.
My code originally looked like this:
...
if ray.train.get_context().get_world_rank() == 0:
    ...
    ray.train.report(metrics, checkpoint=checkpoint)
...
The documentation for ray.train.report states clearly that it should be called from all workers.
In my code, ray.train.report was only called on the rank 0 worker and never on the workers with non-zero rank.
The following change fixed my problem:
...
if ray.train.get_context().get_world_rank() == 0:
    ...
    ray.train.report(metrics, checkpoint=checkpoint)
    ...
else:
    ray.train.report(metrics, checkpoint=None)
...
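For anyone who wants the same pattern without repeating the if/else in every loop, here is a minimal sketch of a small wrapper. The helper name report_on_every_rank and its report_fn parameter are my own inventions for illustration, not part of Ray; in a real training function you would pass ray.train.report as report_fn and ray.train.get_context().get_world_rank() as world_rank.

```python
from typing import Any, Callable, Dict, Optional


def report_on_every_rank(
    world_rank: int,
    metrics: Dict[str, Any],
    checkpoint: Optional[Any],
    report_fn: Callable[..., None],
) -> None:
    """Call report_fn exactly once on every worker.

    Hypothetical helper: only rank 0 attaches the checkpoint,
    every other rank reports the metrics with checkpoint=None.
    """
    # Rank 0 passes the checkpoint through; all other ranks pass None.
    checkpoint_to_send = checkpoint if world_rank == 0 else None
    # The call itself is unconditional, so no worker skips the report.
    report_fn(metrics, checkpoint=checkpoint_to_send)


# Assumed usage inside a Ray Train training function:
#
#   report_on_every_rank(
#       ray.train.get_context().get_world_rank(),
#       metrics,
#       checkpoint,
#       ray.train.report,
#   )
```

The point of the design is that the report call is not inside any rank check, so it is harder to accidentally skip it on the non-zero ranks again.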