I'm running Ray on Python 3.8.1. The following error is raised when I use ExperimentAnalysis to retrieve a checkpoint:
AttributeError: Can't get attribute 'InsufficientResourcesManager' on <module 'ray.tune.insufficient_resources_manager' from '/.local/lib/python3.8/site-packages/ray/tune/insufficient_resources_manager.py'>
The full warning:
ray/tune/utils/serialization.py:43: DeprecationWarning: The module ray.tune.insufficient_resources_manager has been moved to ray.tune.execution.insufficient_resources_manager and the old location will be deprecated soon. Please adjust your imports to point to the new location. Example: Do a global search and replace ray.tune.insufficient_resources_manager with ray.tune.execution.insufficient_resources_manager.
import numpy as np
import pandas as pd
import torch
import os
import sys
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
import math
from functools import partial
from ray import tune
from ray.air.checkpoint import Checkpoint
from ray.tune import CLIReporter
from ray.tune.schedulers import ASHAScheduler
from sklearn.preprocessing import StandardScaler
if __name__ == "__main__":
    # Path to the experiment state JSON file, built from command-line arguments.
    checkpoint_path = os.path.join(sys.argv[3], sys.argv[4])
    res = tune.ExperimentAnalysis(experiment_checkpoint_path=checkpoint_path)
Here is the rest of the traceback:
line 48, in retrieve_trial
    self.result = tune.ExperimentAnalysis(experiment_checkpoint_path = checkpoint_path)
  File "/home/piero/.local/lib/python3.8/site-packages/ray/tune/analysis/experiment_analysis.py", line 89, in __init__
    self._load_checkpoints(experiment_checkpoint_path)
  File "/home/piero/.local/lib/python3.8/site-packages/ray/tune/analysis/experiment_analysis.py", line 135, in _load_checkpoints
    self._load_checkpoints_from_latest(latest_checkpoint)
  File "/home/piero/.local/lib/python3.8/site-packages/ray/tune/analysis/experiment_analysis.py", line 141, in _load_checkpoints_from_latest
    experiment_state = json.load(f, cls=TuneFunctionDecoder)
  File "/usr/lib/python3.8/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.8/json/__init__.py", line 370, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/home/piero/.local/lib/python3.8/site-packages/ray/tune/utils/serialization.py", line 39, in object_hook
    return self._from_cloudpickle(obj)
  File "/home/piero/.local/lib/python3.8/site-packages/ray/tune/utils/serialization.py", line 43, in _from_cloudpickle
    return cloudpickle.loads(hex_to_binary(obj["value"]))
Sorry for the confusion, but where is the error being raised? The original post refers to `InsufficientResourcesManager`, while the trace in the most recent comment looks like a potential deserialization issue.
The error is raised on the final line of the traceback:
  File "/home/piero/.local/lib/python3.8/site-packages/ray/tune/utils/serialization.py", line 43, in _from_cloudpickle
    return cloudpickle.loads(hex_to_binary(obj["value"]))
The 'InsufficientResourcesManager' message is shown right after it. I believe I'm passing the correct experiment JSON file (e.g. experiment_state-2022-08-18_17-03-13.json), as described in the docs.
Probably, but I'm not sure. I tried running a new experiment on a different machine and it worked. If the issue is related to what you describe, that's fine; I can run another one.
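For anyone who lands here with the same message: this kind of AttributeError typically appears when a pickled object refers to a class by its module path and that class is no longer found at that path when the data is loaded. The sketch below reproduces the mechanism with the standard library `pickle` only. It is not Ray's code; the module name `old_location` and the class `Manager` are hypothetical stand-ins, and it assumes the experiment state file stores the class by reference in the same way, which the error text suggests but this thread does not confirm.

```python
import pickle
import sys
import types

# Hypothetical module and class names, used only for illustration.
old_module = types.ModuleType("old_location")
sys.modules["old_location"] = old_module

# Stand-in for a class that a later library version moves elsewhere.
Manager = type("Manager", (), {})
Manager.__module__ = "old_location"
old_module.Manager = Manager

# pickle records only the reference "old_location.Manager", not the class body.
payload = pickle.dumps(Manager())

# Simulate an upgrade in which the class is no longer defined in old_location.
del old_module.Manager

try:
    pickle.loads(payload)
    error_message = None
except AttributeError as exc:
    # Loading fails because the recorded module no longer exposes the class.
    error_message = str(exc)

print(error_message)
```

If that is what is happening here, the state file would only load in an environment where the class still lives at the path recorded when the experiment was saved.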