Hi,
tune.run seems to completely ignore my attempt to disable the loggers (passing loggers=None). Here is my tune.run call:
analysis = tune.run(
    run_search_distributed_tune,
    # loggers=[CSVLogger, JsonLogger],
    loggers=None,
    name="loanrecommender_hyperparam_search_with_ray_tune",
    # scheduler=sched,
    metric="mean_accuracy",
    mode="max",
    stop={
        "mean_accuracy": 0.99,
        "training_iteration": hyper_file['iterations']
    },
    num_samples=hyper_file['iterations'],
    resources_per_trial={
        "memory": 28500 * 1024 * 1024,
        "gpu": 0.25
    },
    config=search_space,
    local_dir=getRootDir(),
    # keep_checkpoints_num=2,  # Keep only the best checkpoint
    checkpoint_score_attr='mean_accuracy',  # Metric used to compare checkpoints
    verbose=1
)
and this is the exception I get:
Connecting to ray cluster
2021-02-18 15:46:34,212 INFO worker.py:656 -- Connecting to existing Ray cluster at address: 10.150.147.159:6379
Traceback (most recent call last):
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\tensorboardX\record_writer.py", line 58, in open_file
factory = REGISTERED_FACTORIES[prefix]
KeyError: 'C'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:/dev/clientactivity/find_best_model.py", line 186, in <module>
analysis = tune.run(
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\ray\tune\tune.py", line 419, in run
runner.step()
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\ray\tune\trial_runner.py", line 355, in step
self._callbacks.on_trial_start(
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\ray\tune\callback.py", line 180, in on_trial_start
callback.on_trial_start(**info)
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\ray\tune\logger.py", line 428, in on_trial_start
self.log_trial_start(trial)
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\ray\tune\logger.py", line 636, in log_trial_start
self._trial_writer[trial] = self._summary_writer_cls(
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\tensorboardX\writer.py", line 275, in __init__
self._get_file_writer()
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\tensorboardX\writer.py", line 323, in _get_file_writer
self.file_writer = FileWriter(logdir=self.logdir,
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\tensorboardX\writer.py", line 94, in __init__
self.event_writer = EventFileWriter(
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\tensorboardX\event_file_writer.py", line 106, in __init__
self._ev_writer = EventsWriter(os.path.join(
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\tensorboardX\event_file_writer.py", line 43, in __init__
self._py_recordio_writer = RecordWriter(self._file_name)
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\tensorboardX\record_writer.py", line 176, in __init__
self._writer = open_file(path)
File "C:\Users\dm57337.conda\envs\py38tf\lib\site-packages\tensorboardX\record_writer.py", line 61, in open_file
return open(path, 'wb')
FileNotFoundError: [Errno 2] No such file or directory: 'C:\dev\clientactivity\loanrecommender_hyperparam_search_with_ray_tune\run_search_distributed_tune_b9fe7_00000_0_activation=tanh,batch_size=512,dot_product=False,dropout=0.55332,include_branch_concat_i_2021-02-18_15-46-41\events.out.tfevents.1613656001.TLVCMEW001410'
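There are actually two problems here:
- tune.run ignores loggers=None and still creates the default loggers, including TBXLogger.
- TBXLogger fails on my local Windows path. As far as I can tell, tensorboardX's record_writer only registers factories for S3 and Google Cloud Storage, so the drive letter 'C' is looked up as a storage prefix, raises KeyError, and the fallback open(path, 'wb') then fails with FileNotFoundError. See: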
https://github.com/lanpa/tensorboardX/blob/34d1616c035faaa0f3f7c9d19cb8bb4425f19939/tensorboardX/record_writer.py#L99
https://github.com/lanpa/tensorboardX/blob/34d1616c035faaa0f3f7c9d19cb8bb4425f19939/tensorboardX/record_writer.py#L114
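
To make the second problem easier to reason about, here is a minimal sketch of the lookup that the traceback suggests. It is an imitation, not the tensorboardX source: the names open_file and REGISTERED_FACTORIES come from the traceback, but the way the prefix is extracted (the text before the first ':') and the contents of the registry are my assumptions.

```python
import os
import tempfile

# Stand-in for tensorboardX's registry (assumption: only remote-storage prefixes).
REGISTERED_FACTORIES = {"s3": None, "gs": None}


def open_file(path):
    """Simplified imitation of the prefix lookup seen in the traceback."""
    prefix = path.split(":")[0]  # yields "C" for a path like "C:\\dev\\..."
    try:
        factory = REGISTERED_FACTORIES[prefix]
    except KeyError:
        # Unregistered prefix: fall back to the local filesystem.
        return open(path, "wb")
    return factory.open(path)


def try_open(path):
    """Return 'ok', or the name of the OSError subclass raised by open_file."""
    try:
        open_file(path).close()
        return "ok"
    except OSError as exc:
        return type(exc).__name__


with tempfile.TemporaryDirectory() as root:
    # Parent directory exists: the fallback open() succeeds.
    existing = os.path.join(root, "events.out.tfevents.test")
    # Parent directory does not exist: the fallback open() fails.
    missing = os.path.join(root, "no_such_trial_dir", "events.out.tfevents.test")
    result_existing = try_open(existing)
    result_missing = try_open(missing)

print(result_existing, result_missing)
```

If the real code behaves like this sketch, the KeyError is only the failed prefix lookup for a local path, and the FileNotFoundError comes from the plain open() call itself, for example when the trial directory cannot be found on disk.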