I’m having trouble getting the stderr and stdout files for individual trials synced to cloud storage. If I run Ray Tune on a local Ray cluster, the local trial directories get populated with all the log files, including params and checkpoints as well as the stderr and stdout files. Everything is also present in the corresponding cloud directories I sync to.
However, if I run on my remote Ray cluster, none of the log directories seem to show up on the head node, or at least nowhere I’ve been able to find them. More confusingly, some of the files DO get synced to S3, such as the params, checkpoints, and errors, but not the stdout and stderr files. If I kubectl exec into the running pod, I can see that the stderr and stdout files ARE being written to the local trial directories within the experiment dir under ray_results.
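To pin down exactly which files are missing, a small script along these lines could be run inside the pod. It is only a sketch using the Python standard library: the function names are made up for illustration, and it assumes the remote listing (for example, the relative keys under the S3 experiment prefix) has been collected separately and passed in as plain strings.

```python
from pathlib import Path


def list_relative_files(root):
    """Return the set of file paths under root, relative to root, as POSIX strings."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def missing_remotely(local_files, remote_files):
    """Return the sorted local paths that have no counterpart in the remote listing."""
    return sorted(set(local_files) - set(remote_files))
```

For example, calling list_relative_files on the experiment dir under ray_results and passing the result to missing_remotely together with the set of relative S3 keys would list the files that exist in the pod but never reached the bucket.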
Does anyone have suggestions for how I can fix this? I vaguely remember these files getting synced before, but now I’m not sure whether that only happened for local Ray cluster runs.