Actor Design: Storing object refs and retrieving the object

Pballer · September 16, 2024, 4:31pm

How severely does this issue affect your experience of using Ray?

None: Asking for design help.

I have the following Actor that holds a dataframe. I don’t want the actor to hold the dataframe in memory, so I put it in the object store and then get it when needed.

It feels like an anti-pattern because I explicitly put the df in the object store in load_dataset(). Then I manually de-reference it when needed in shape().

What is the best way to design an actor that holds data along with functions that reference that data?

import pandas as pd
import ray
from sklearn.datasets import load_digits, load_iris, load_wine

@ray.remote
class DataSet:
    """This remote class wraps a Sklearn dataset."""

    dataset_dict = {
        'iris': load_iris,
        'wine': load_wine,
        'digits': load_digits
    }

    def __init__(self, dataset_choice):
        self.dataset_choice = dataset_choice
        self.sklearn_data_ref, self.dataset_ref = self.load_dataset(dataset_choice)

    def load_dataset(self, dataset_choice):
        load_dataset = self.dataset_dict[dataset_choice]
        sklearn_data = load_dataset()
        dataset_df = pd.DataFrame(data=sklearn_data.data, columns=sklearn_data.feature_names)
        sk_ref = ray.put(sklearn_data)
        dataset_ref = ray.put(dataset_df)

        return sk_ref, dataset_ref

    def shape(self):
        dataset = ray.get(self.dataset_ref)
        return dataset.shape

sangcho · September 22, 2024, 12:40am

Unless you pass the dataset ref to other workers, it is better to just keep a direct reference to the data within the actor.
