Logging / compiling conflict on multiple processes

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
Medium: Significantly affects my productivity but can find a workaround.
High: Completely blocks me.

2. Environment:

  • Ray version: 2.44.0
  • Python version: 3.12.7
  • OS: Windows
  • Cloud/Infrastructure: None
  • Other libs/tools (if relevant): PyTorch

3. What happened vs. what you expected:

  • Expected: Not sure what to expect here; I guessed a conflict might happen.
  • Actual: A conflict does happen.

The training code uses the standard logging module and writes to a log file; it has nothing to do with Ray in general (Log file src).
Now, when using Ray Tune, multiple processes are spawned, the logging setup fails, and I end up with:
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
What is the solution in this case? Write to a Ray handler? Ask the Ray driver to write to the rotating log file instead of the custom logging system?
The logging should not always have to go through the Ray system, though… most of the time it just needs to write to a file from a plain training process while experimenting with models.
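As a possible workaround (a minimal sketch, not taken from my actual code): configure logging inside the trainable so that each Tune worker opens its own log file, instead of every process sharing the same rotating file. The logger name, the `log_dir` config key, and the use of os.getpid() in the filename are assumptions for illustration.

```python
# Sketch: per-process log files so no two Tune workers open the same file.
import logging
import os


def setup_worker_logging(log_dir: str) -> logging.Logger:
    """Configure logging once per worker process (call inside the trainable)."""
    logger = logging.getLogger("train_classifier")  # assumed logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        os.makedirs(log_dir, exist_ok=True)
        # One file per process, so the WinError 32 file lock never triggers.
        path = os.path.join(log_dir, f"train_{os.getpid()}.log")
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


def trainable(config):
    # Runs in the Tune worker process, so each trial gets its own log file.
    logger = setup_worker_logging(config.get("log_dir", "logs"))
    logger.info("trial started with config: %s", config)
    # ... training loop ...
```

This keeps the training code independent of Ray; the only change is that the file path is unique per process rather than a single shared rotating log.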

A similar issue is also raised by torch.compile, although the failure rate is much lower since recompilations rarely overlap across processes.

torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised: PermissionError: [Errno 13] Permission denied: 'C:\\Users\\EXPERI~1\\AppData\\Local\\Temp\\torchinductor_Experimental\\cache\\b228b8d4e03689de91e4ff73432ea396b2322eb4e1e7234060d52519b5ee0cce'
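For the inductor side, a workaround I would try (an assumption, not a confirmed fix) is to give each worker its own compile cache directory via the TORCHINDUCTOR_CACHE_DIR environment variable before the first torch.compile call, so concurrent trials never write to the same Temp\torchinductor_* path. The helper name and the per-PID directory scheme below are illustrative.

```python
# Sketch: point each worker at a private inductor cache directory.
import os
import tempfile


def set_private_inductor_cache() -> None:
    # TORCHINDUCTOR_CACHE_DIR is read by the inductor backend when compiling;
    # a per-PID directory avoids two processes racing on the same cache file.
    cache_dir = os.path.join(tempfile.gettempdir(), f"torchinductor_{os.getpid()}")
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["TORCHINDUCTOR_CACHE_DIR"] = cache_dir


# Call this at the top of the trainable, before any torch.compile(...) runs.
```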

Reproducing the error is easy: run
python -m src.train_classifier (see train_classifier)