I am trying out Ray Serve and have written a sample deployment. To simulate a real-time workload, I call time.sleep(0.5) in my method.
When I call the Serve deployment endpoint concurrently (1000 invocations), the HttpProxyActor's CPU usage goes to 100% and the HttpProxyActor gets stopped.
Which parameters would help me here? Please reply …
import time

def summarize(text, endpointname):
    # Load model
    reversedString = reverseMyText(text)
    # Simulate real-time processing latency
    time.sleep(0.5)
    return reversedString
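
For context, the load pattern I am generating looks roughly like the sketch below. This is a minimal, stdlib-only approximation and not my actual client: `reverse_my_text`, `run_load`, the `delay` argument and the `max_workers` cap are hypothetical stand-ins, and the function is called locally instead of through the Serve HTTP endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def reverse_my_text(text):
    # Hypothetical stand-in for reverseMyText: plain string reversal.
    return text[::-1]


def summarize(text, delay=0.5):
    # Same shape as the deployment method: do the work, then sleep
    # to simulate processing latency.
    reversed_string = reverse_my_text(text)
    time.sleep(delay)
    return reversed_string


def run_load(n_calls, max_workers, delay=0.5):
    # Issue n_calls invocations with at most max_workers in flight at once.
    texts = [f"request-{i}" for i in range(n_calls)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(lambda t: summarize(t, delay), texts))


if __name__ == "__main__":
    results = run_load(n_calls=1000, max_workers=100)
    print(len(results), results[0])
```

Capping the number of in-flight calls with `max_workers` is only an assumption about how a client could throttle itself; in my test all 1000 invocations are fired concurrently.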