Hi! I’m a student at Stanford. My teammates and I are working on a research project for our CS 244 class, where the assignment is to reimplement a paper and reproduce its results. We chose Ray because we really admire it, and we are building a miniaturized “Baby Ray” to try to recreate some of the figures in the paper!
We are particularly interested in the GCS flushing feature described in Section 5.1 (GCS Flushing) of the 2018 paper. Specifically, we would love to hear from the authors about how flushing the GCS to disk was implemented in the release of Ray that the paper describes, which precedes the Ray v1 architecture (2020). Our main questions are:
- How was GCS flushing to disk implemented? Did you use MySQL/SQLite? Did you use an early equivalent of object spilling, whereby each individual object was stored as a file on disk? Or something else?
- How did the original implementation handle cache misses once GCS flushing was in place?
We noticed that the v1 white paper does not mention flushing. It does discuss object spilling, which was introduced in v1.3, but we want to understand how these concerns were handled in the earlier implementation described in the original Ray paper.
Any insights or advice on this matter would be greatly appreciated! Thank you!