How to distribute a very large FC layer?

Hi Ray community, I’m new to Ray and excited to learn it.

I want to implement a simple logistic regression model, but the number of features is very large, e.g. 2^32. That means the torch.nn.Linear layer would be a huge FC layer, while the input data is actually sparse, given as arrays of feature ids.
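
To make the setup concrete, here is a minimal sketch of the sparse formulation I have in mind. It is plain Python, and the names (SparseLogReg, predict, sgd_step) are hypothetical, not from PyTorch or Ray. It assumes binary features, so an example is just a list of active feature ids, and each example only touches the weights of those ids:

```python
import math

class SparseLogReg:
    """Hypothetical sketch: logistic regression over sparse binary features."""

    def __init__(self):
        # Weights are stored lazily: only ids that were updated take memory.
        self.weights = {}
        self.bias = 0.0

    def predict(self, feature_ids):
        # z = bias + sum of the weights of the active feature ids.
        z = self.bias + sum(self.weights.get(f, 0.0) for f in feature_ids)
        return 1.0 / (1.0 + math.exp(-z))

    def sgd_step(self, feature_ids, label, lr=0.1):
        # For log loss, d(loss)/dz = prediction - label.
        grad = self.predict(feature_ids) - label
        for f in feature_ids:
            self.weights[f] = self.weights.get(f, 0.0) - lr * grad
        self.bias -= lr * grad

model = SparseLogReg()
example = [3, 2**32 - 1]          # two active ids out of 2^32 possible
for _ in range(50):
    model.sgd_step(example, label=1.0)
```

This works on one machine as long as the set of ids that actually occur fits in memory; my question is about the case where it does not.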

In PyTorch, we can define such a huge FC layer, but that would not be efficient: at float32, 2^32 weights for a single output already take 16 GiB.
I read about RaySGD, but it looks like it only supports data parallelism, whereas I think my problem is more about model parallelism.

I think parameter servers are typically used for this use case in industry, so I wonder how to do this in Ray.
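
Roughly, this is the kind of sharded parameter server I am picturing. It is only a sketch in plain Python so that it runs without a cluster; ParameterShard and ShardedParameterServer are hypothetical names, and sharding by feature id modulo the number of shards is my own assumption. My guess is that in Ray each shard would become an actor (a class decorated with @ray.remote), but I am not sure that is the recommended way:

```python
from collections import defaultdict

class ParameterShard:
    """Holds the weights for one slice of the feature-id space."""

    def __init__(self):
        self.weights = {}

    def pull(self, feature_ids):
        # Return current weights for the requested ids (0.0 if never updated).
        return {f: self.weights.get(f, 0.0) for f in feature_ids}

    def push(self, grads, lr):
        # Apply an SGD update for the ids this shard owns.
        for f, g in grads.items():
            self.weights[f] = self.weights.get(f, 0.0) - lr * g

class ShardedParameterServer:
    """Routes each feature id to the shard at index (id % num_shards)."""

    def __init__(self, num_shards):
        self.shards = [ParameterShard() for _ in range(num_shards)]

    def pull(self, feature_ids):
        by_shard = defaultdict(list)
        for f in feature_ids:
            by_shard[f % len(self.shards)].append(f)
        out = {}
        for idx, ids in by_shard.items():
            out.update(self.shards[idx].pull(ids))
        return out

    def push(self, grads, lr=0.1):
        by_shard = defaultdict(dict)
        for f, g in grads.items():
            by_shard[f % len(self.shards)][f] = g
        for idx, shard_grads in by_shard.items():
            self.shards[idx].push(shard_grads, lr)

ps = ShardedParameterServer(num_shards=4)
ps.push({5: 1.0, 8: -2.0}, lr=0.5)   # gradients for two feature ids
current = ps.pull([5, 8, 2**32 - 1])
```

A worker would pull only the weights for the ids in its mini-batch, compute gradients locally, and push them back, so no single process ever holds the full weight vector.
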
Thanks

cc @rliaw, can you follow up with him? I think this is relevant to your team.