We have a new utility in the latest Ray wheels that helped me uncover some issues:
from ray.util import check_serialize
# I first tried this
check_serialize.inspect_serializability(TuneMovielensModel)
# I then ran it again on the object reported as failing
check_serialize.inspect_serializability(CandidateModel)
# I then ran it once more on the next object reported as failing
check_serialize.inspect_serializability(MovieModel.__init__)
This showed me that the Dataset objects (like the unique_user_ids) were not pickleable; in the output below, the failing global is movies, a MapDataset:
============================================================================
Checking Serializability of <function MovieModel.__init__ at 0x7fa87c6c89d8>
============================================================================
!!! FAIL serialization: Tensorflow type 21 not convertible to numpy dtype.
Detected 3 global variables. Checking serializability...
Serializing 'tf' <module 'tensorflow' from '/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py'>...
Serializing 'unique_movie_titles' [b"'Til There Was You (1997)" b'1-900 (1994)' b'101 Dalmatians (1996)' ...
b'Zeus and Roxanne (1997)' b'unknown'
b'\xc3\x81 k\xc3\xb6ldum klaka (Cold Fever) (1994)']...
Serializing 'movies' <MapDataset shapes: (), types: tf.string>...
!!! FAIL serialization: Tensorflow type 21 not convertible to numpy dtype.
...
Serializing 'unique_movie_titles' [b"'Til There Was You (1997)" b'1-900 (1994)' b'101 Dalmatians (1996)' ...
b'Zeus and Roxanne (1997)' b'unknown'
b'\xc3\x81 k\xc3\xb6ldum klaka (Cold Fever) (1994)']...
Serializing 'movies' <MapDataset shapes: (), types: tf.string>...
!!! FAIL serialization: Tensorflow type 21 not convertible to numpy dtype.
Detected 1 nonlocal variables. Checking serializability...
Serializing '__class__' <class '__main__.MovieModel'>...
!!! FAIL serialization: Tensorflow type 21 not convertible to numpy dtype.
Serializing '_TF_MODULE_IGNORED_PROPERTIES' frozenset({'_self_unconditional_dependency_names', '_self_unconditional_checkpoint_dependencies', '_train_counter', '_predict_counter', '_obj_reference_counts_dict', '_test_counter', '_steps_per_execution'})...
============================================================================
Variable:
FailTuple(__class__ [obj=<class '__main__.MovieModel'>, parent=<function MovieModel.__init__ at 0x7fa87c6c89d8>])
FailTuple(movies [obj=<MapDataset shapes: (), types: tf.string>, parent=<function MovieModel.__init__ at 0x7fa87c6c89d8>])
FailTuple(_add_variable_with_custom_getter [obj=<bound method Trackable._add_variable_with_custom_getter of <MapDataset shapes: (), types: tf.string>>, parent=<MapDataset shapes: (), types: tf.string>])
was found to be non-serializable. There may be multiple other undetected variables that were non-serializable.
Consider either removing the instantiation/imports of these variables or moving the instantiation into the scope of the function/class.
If you have any suggestions on how to improve this error message, please reach out to the Ray developers on github.com/ray-project/ray/issues/
============================================================================
(False,
Workaround: Consider moving the dataset construction into TuneMovielensModel. You can then pass the dataset objects to each of these models through their respective __init__ methods. Let me know if that helps!
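Here is a rough sketch of the idea behind the workaround, using only the standard library so it runs without Ray or TensorFlow. Everything in it is a stand-in I made up for illustration: make_dataset plays the role of building a tf.data Dataset (a generator is used because it also cannot be pickled), and train plays the role of the trainable. It is not the actual TuneMovielensModel code.

```python
import pickle


def make_dataset(titles):
    """Hypothetical stand-in for building a tf.data.Dataset.

    A generator is used because, like the MapDataset above, it cannot be pickled.
    """
    return (title.lower() for title in titles)


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def train(config):
    """Hypothetical trainable: builds the dataset inside its own scope."""
    # The dataset is created where it is used, so it never has to be serialized.
    dataset = make_dataset(config["titles"])
    return sorted(dataset)


# Shipping the dataset object itself fails to serialize ...
bad_config = {"dataset": make_dataset(["B", "A"])}
# ... while shipping plain data and rebuilding the dataset later works.
good_config = {"titles": ["B", "A"]}

print(is_picklable(bad_config))   # False
print(is_picklable(good_config))  # True
print(train(good_config))         # ['a', 'b']
```

The same pattern should apply to the real code: keep only plain, picklable inputs at module level or in the config, and build the Dataset objects inside the trainable before handing them to the models.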