Using Ray Tune with the tensorflow_recommenders library and PBT - InternalError

Hi,

I was trying to use Ray Tune with the tensorflow_recommenders library, but it constantly gives me this error:

/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)
InternalError: Tensorflow type 21 not convertible to numpy dtype.

I figured out that the error is present even in the basic tutorial, so a reproducible example can be found here:
https://colab.research.google.com/drive/1ZypDwbj6aLTK15t3QNQ04MBLsR2auNtS?usp=sharing

Any suggestions as to why this might happen and how to fix it?

Thank you

Hey @jakubvedral! Thanks for dropping by.

We have a new utility on the latest Ray wheels that helped me uncover some issues:

from ray.util import check_serialize

# I first tried this
check_serialize.inspect_serializability(TuneMovielensModel)
# I then ran this again on the returned value
check_serialize.inspect_serializability(CandidateModel)
# I then ran it yet again on the returned value
check_serialize.inspect_serializability(MovieModel.__init__)

This showed me that the tf.data Dataset objects (like movies) were not pickleable. TensorFlow type 21 is the variant dtype, which backs tf.data.Dataset objects and has no numpy equivalent, hence the error message.

============================================================================
Checking Serializability of <function MovieModel.__init__ at 0x7fa87c6c89d8>
============================================================================
!!! FAIL serialization: Tensorflow type 21 not convertible to numpy dtype.
Detected 3 global variables. Checking serializability...
    Serializing 'tf' <module 'tensorflow' from '/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py'>...
    Serializing 'unique_movie_titles' [b"'Til There Was You (1997)" b'1-900 (1994)' b'101 Dalmatians (1996)' ...
 b'Zeus and Roxanne (1997)' b'unknown'
 b'\xc3\x81 k\xc3\xb6ldum klaka (Cold Fever) (1994)']...
    Serializing 'movies' <MapDataset shapes: (), types: tf.string>...
    !!! FAIL serialization: Tensorflow type 21 not convertible to numpy dtype.
...
            Serializing 'unique_movie_titles' [b"'Til There Was You (1997)" b'1-900 (1994)' b'101 Dalmatians (1996)' ...
 b'Zeus and Roxanne (1997)' b'unknown'
 b'\xc3\x81 k\xc3\xb6ldum klaka (Cold Fever) (1994)']...
            Serializing 'movies' <MapDataset shapes: (), types: tf.string>...
            !!! FAIL serialization: Tensorflow type 21 not convertible to numpy dtype.
        Detected 1 nonlocal variables. Checking serializability...
            Serializing '__class__' <class '__main__.MovieModel'>...
            !!! FAIL serialization: Tensorflow type 21 not convertible to numpy dtype.
        Serializing '_TF_MODULE_IGNORED_PROPERTIES' frozenset({'_self_unconditional_dependency_names', '_self_unconditional_checkpoint_dependencies', '_train_counter', '_predict_counter', '_obj_reference_counts_dict', '_test_counter', '_steps_per_execution'})...
============================================================================
Variable: 

	FailTuple(__class__ [obj=<class '__main__.MovieModel'>, parent=<function MovieModel.__init__ at 0x7fa87c6c89d8>])
FailTuple(movies [obj=<MapDataset shapes: (), types: tf.string>, parent=<function MovieModel.__init__ at 0x7fa87c6c89d8>])
FailTuple(_add_variable_with_custom_getter [obj=<bound method Trackable._add_variable_with_custom_getter of <MapDataset shapes: (), types: tf.string>>, parent=<MapDataset shapes: (), types: tf.string>])

was found to be non-serializable. There may be multiple other undetected variables that were non-serializable. 
Consider either removing the instantiation/imports of these variables or moving the instantiation into the scope of the function/class. 
If you have any suggestions on how to improve this error message, please reach out to the Ray developers on github.com/ray-project/ray/issues/
============================================================================
(False,

Workaround: consider moving the dataset construction/creation into TuneMovielensModel. Then you can pass the dataset objects into each of these models via their respective __init__ methods. Let me know if that helps!
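
Roughly like this — an untested sketch following the basic TFRS tutorial; it assumes MovielensModel is refactored so that movies and unique_movie_titles come in as constructor arguments instead of being read from notebook globals:

import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
from ray import tune

class TuneMovielensModel(tune.Trainable):
    def setup(self, config):
        # Build the tf.data pipelines here, inside the Trainable, so that
        # Tune only has to pickle the class definition -- no live Dataset
        # objects are captured from the notebook's global scope.
        ratings = tfds.load("movielens/100k-ratings", split="train")
        movies = tfds.load("movielens/100k-movies", split="train")
        movies = movies.map(lambda x: x["movie_title"])
        ratings = ratings.map(
            lambda x: {"movie_title": x["movie_title"], "user_id": x["user_id"]}
        )

        unique_movie_titles = np.unique(
            np.concatenate(list(movies.batch(1_000).as_numpy_iterator()))
        )

        shuffled = ratings.shuffle(100_000, seed=42, reshuffle_each_iteration=False)
        self.cached_train = shuffled.take(80_000).batch(8_192).cache()
        self.cached_test = shuffled.skip(80_000).take(20_000).batch(4_096).cache()

        # Hand the data to the model through __init__ arguments instead of
        # letting the sub-models close over globals (hypothetical signature).
        self.model = MovielensModel(
            movies=movies,
            unique_movie_titles=unique_movie_titles,
            layer_sizes=config.get("layer_sizes", [32]),
        )
        self.model.compile(
            optimizer=tf.keras.optimizers.Adagrad(config.get("lr", 0.01))
        )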

Dear @rliaw, can you elaborate a little more on “moving the dataset construction/creation into the TuneMovielensModel”?

I tried the following change, but I still get the same error.

class TuneMovielensModel(tune.Trainable):
    def setup(self, config, cached_train, cached_test, test_ds):
        self.model = MovielensModel(layer_sizes=self.config.get("layer_sizes", [32]))
        self.model.compile(optimizer=tf.keras.optimizers.Adagrad(self.config.get("lr", 0.01)))
        self.cached_train = cached_train
        self.cached_test = cached_test
        self.test_ds = test_ds

    def step(self):
        # train
        model.fit(
            self.cached_train,
            validation_data=self.cached_test,
            validation_freq=5,
            epochs=50,
            verbose=1
        )
        accuracy = self.model.evaluate(test_ds, return_dict=True, verbose=0)['factorized_top_k/top_5_categorical_accuracy']
        return {"top_5_categorical_accuracy": accuracy}

This should be self.model, right?
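
For reference, a corrected sketch of step — note that test_ds on the evaluate line has the same problem and also needs the self. prefix. (Passing the Datasets into setup as arguments still requires Ray to pickle them, which would explain why the original serialization error persists; constructing them inside setup, as sketched above, avoids pickling them altogether.)

    def step(self):
        # `model` and `test_ds` are attributes created in setup(), so both
        # references must go through self.
        self.model.fit(
            self.cached_train,
            validation_data=self.cached_test,
            epochs=1,  # assumption: one epoch per step, so PBT can perturb between steps
            verbose=0,
        )
        accuracy = self.model.evaluate(
            self.test_ds, return_dict=True, verbose=0
        )["factorized_top_k/top_5_categorical_accuracy"]
        return {"top_5_categorical_accuracy": accuracy}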