Hi! I’m new to Ray and have been using it for some basic NLP tasks. I had a question about the general setup of memory in the system. Ideally, I’d like to have something like:
There’s a text array and some other read-only items that are fairly large but only need to be read by each worker. Then each worker does some computation and local writes that I can combine as an output later on. It’s fairly similar to map-reduce, and I’ve done the GitHub example and seen the pattern document. My confusion comes with whether or not text_array and the other read-only items are being copied over to the workers. From the readouts in !ray memory it seems like they’re not, but when I set up the futures it gives me the warning that the function has a large size when pickled, implying that they are. It would be great if they didn’t have to be copied over since they’re read-only and should be the same across all workers, so I was hoping someone could shed some light on this.
Hmmm the large pickle warnings are referring to issues when serializing the remote function. They tend to come accidentally including some large object in a function’s scope.
Hi Alex! I think you captured the heart of my issue pretty well. I created a simple example of the idea of what my code is doing below. The actual code is doing something similar to latent semantic analysis but is probably a lot harder to read. So it’s taking in a large text array and doing computation on it in parallel. You can imagine arr is the text array that ideally I don’t want to be copied over to each local worker since it’s read-only.
I’m also confused why the warning says the pickled function has size ~320MB (so almost entirely because of the array arr) while !ray memory says the much more modest ~80KB. I don’t think I’m understanding the difference and the implications of the two.
Yep, so the issue is that function definitions are stored in GCS (Redis) which is only suitable for small function definitions. In general, you should try to put the large object in the object store, then pass that object as an argument. In your case, that might look like:
Ah I had tried something similar but was lacking some conceptual understanding. I see the issue now, thanks Alex! And to confirm my understanding, ray.put is putting the object in a shared memory object store and then all the workers can reference it using the returned reference id?