How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Currently I’m serving an LLM using Ray Serve on top of a FastAPI application. To manage chat memory and document storage I’m using Llama-Index with Qdrant as the vector DB. I’m attempting to share these objects through a state object created in the __init__ method of the ingress class, like so:
@serve.deployment
@serve.ingress(gpt4o_mini_proto)
class LLMIngress:
    def __init__(self) -> None:
        self.session_state = {}
        self._llm = OpenAI(
            model="gpt-4o-mini-2024-07-18",
        )
        self._embed_model = OpenAIEmbedding(
            model="text-embedding-3-large", dimensions=1024
        )
        # Load / Create Index
        client = QdrantClient(
            host="qdrant_prod",
            port=6333,
        )
        aclient = AsyncQdrantClient(
            host="qdrant_prod",
            port=6333,
        )
        ...
        self.session_state["client"] = client
        self.session_state["aclient"] = aclient
Now here’s my question. Mind that I haven’t been able to actually test this yet, but I need to be sure before I push it to production. My intent was to have all replicas share the same session_state, but after combing through the documentation it appears that this will not be the case when there is more than one replica. So I want to ask:
- Are ingress class attributes shared between replicas or not?
- If the answer to the previous question is negative, how can I force all replicas to share the same state object?
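
To make the first question concrete, here is a minimal plain-Python sketch of the behaviour I suspect. It does not use Ray or Qdrant at all: `FakeIngress` is a hypothetical stand-in for `LLMIngress`, and constructing it twice is only my assumption about what happens when each replica runs `__init__` on its own.

```python
class FakeIngress:
    """Hypothetical stand-in for LLMIngress; no Ray or Qdrant involved."""

    def __init__(self) -> None:
        # Each instance builds its own dict, as I assume each replica
        # does when it runs __init__ on startup.
        self.session_state = {}


# Simulate two replicas by constructing the class twice.
replica_a = FakeIngress()
replica_b = FakeIngress()

# A write handled by one "replica" ...
replica_a.session_state["chat_id"] = "abc"

# ... is not visible from the other one.
print("chat_id" in replica_a.session_state)  # True
print("chat_id" in replica_b.session_state)  # False
print(replica_a.session_state is replica_b.session_state)  # False
```

If real replicas behave like these two instances, a request routed to one replica would never see chat memory written by another, which is exactly what I need to avoid.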