This is a starter for a follow-up discussion of my conda-forge efforts (which ended with Ray being available on Windows and Linux via `conda install -c conda-forge ray-all`).
I’ve noticed that building full-fledged Ray takes quite a while for a completely “from scratch” build (about 15-20 minutes on my i7 Skylake desktop), and that a lot of that time is spent building third-party dependencies (like boost or protobuf) which could be consumed in prebuilt form (at least when building for Conda).
What is the exact reason that dependencies are vendored as source and built in-house rather than consumed in prebuilt form?
This also raises a second question: why are there so many dependencies in the graph? Is it possible to eliminate some?
P.S. cc @rliaw, as this is a follow-up thread to our verbal discussion.
@pcmoritz, can you help explain the boost and protobuf building?
RE: removing dependencies, I agree this is a huge problem. We had an internal discussion among the Ray/Anyscale team about reducing these. Specifically, we’re considering a “minimal ray install” that provides the bare minimum (without dashboard support), just for users to get Ray working out of the box.
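To make the idea concrete, here is a minimal sketch of how such a split could look with setuptools extras. The dependency lists and the `dashboard` extra name are illustrative assumptions on my part, not Ray’s actual `setup.py`:

```python
# Hypothetical sketch of a "minimal core + optional dashboard" split using
# setuptools extras. Package lists below are illustrative, not Ray's real ones.
from setuptools import setup, find_packages

setup(
    name="ray",
    packages=find_packages(),
    # Bare minimum needed to start Ray and run tasks/actors out of the box.
    install_requires=[
        "click",
        "msgpack",
        "protobuf",
        "pyyaml",
    ],
    extras_require={
        # Opt-in extra that pulls in the dashboard's web stack.
        "dashboard": [
            "aiohttp",
            "aiohttp-cors",
            "prometheus-client",
        ],
    },
)
```

With a layout like this, `pip install ray` would stay lean, while `pip install "ray[dashboard]"` would opt into the heavier dashboard stack.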
@rliaw I believe you’re referring to Python dependencies when you `pip install ray`, whereas @vnlitvinov is referring to dependencies at compile time.
We’re optimizing a lot for Ray users (even more so than for Ray developers), so we care about making `pip install ray` work as seamlessly as possible. Note that there is a tradeoff here.
Regarding building dependencies from source: there might be some low-hanging fruit to speed up compilation by downloading binaries. However, the current approach is very reliable, e.g., there’s no danger that a prebuilt dependency doesn’t exist on the developer’s machine or something like that.
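For context, this is roughly what the source-based approach looks like in a Bazel `WORKSPACE` file (Starlark, a Python dialect). The version, URL, and checksum below are placeholders for illustration, not the pins Ray actually uses:

```python
# Sketch of fetching a third-party dependency as pinned source in Bazel.
# The URL, version, and sha256 below are placeholders, not Ray's actual pins.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "com_google_protobuf",
    # The checksum pins the exact source tarball, so every machine builds
    # byte-identical inputs regardless of what is installed locally.
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000",
    strip_prefix = "protobuf-3.12.4",
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.12.4.tar.gz"],
)
```

Because the URL and checksum are pinned, the build never depends on whatever (possibly mismatched) boost or protobuf happens to be installed on the developer’s machine, which is the reliability point above.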
Regarding removing some dependencies to simplify things and speed up compilation, there’s probably more we can do here.