Hello Team,
I need to fetch data by calling a REST API and then perform some operations on it. The API call sometimes takes 11 seconds, and the operations are CPU-bound and take a few milliseconds.
Currently I keep everything in a single task, but I am thinking of splitting it up: fetching the data in an async actor and doing the computation in a task.
I am running this task with a single CPU and 1 GB of RAM.
Suppose I have to run this task 1000 times concurrently. How should I go about it?
Is this the correct approach, or am I making some mistakes?
Yeah, this is the right way to go. You can have an async actor that calls the API and invokes a task based on the result. For example:
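    import asyncio
    import ray

    @ray.remote
    class APIReadActor:
        async def run(self):
            while True:
                # call_api() is assumed to be a coroutine; a blocking call here would stall the event loop.
                data = await call_api()
                # Await the object ref directly; ray.get() would block the event loop.
                await cpu_task.remote(data)
                # Yield control so the actor can still process other requests.
                await asyncio.sleep(0)

The asyncio.sleep(0) call is a hack that lets the while loop run inside the actor while the actor can still process other requests. Note that the actor is always single-threaded, so if one of its methods runs a while loop without yielding, the actor cannot process any other calls.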
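To see why yielding inside the loop matters, here is a minimal sketch that uses plain asyncio instead of a Ray actor. The names `polling_loop` and `other_request` are hypothetical stand-ins for the actor's `run` loop and for another call arriving at the same actor; no real API is called.

```python
import asyncio


async def polling_loop(log, iterations):
    """Stand-in for the actor's run loop (hypothetical; no real API call is made)."""
    for i in range(iterations):
        log.append(f"loop-{i}")
        # Yield to the event loop so other pending coroutines get a turn.
        await asyncio.sleep(0)


async def other_request(log):
    """Stand-in for another method call arriving at the same actor."""
    log.append("other")


async def main():
    log = []
    # Both coroutines run on the same single-threaded event loop.
    await asyncio.gather(polling_loop(log, 3), other_request(log))
    return log


log = asyncio.run(main())
print(log)
```

In the printed list, the "other" entry shows up before the loop's last entry, because each `asyncio.sleep(0)` hands control back to the event loop. Without that `await`, the loop would run to completion before anything else got a turn.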
Another approach can be to allocate fractional CPUs to tasks, for example:
    @ray.remote(num_cpus=0.1)
    def io_task():
        pass
This is simpler than the actor approach, but it has a limitation: you cannot make the CPU fraction arbitrarily small, since each task still needs its own worker process.
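As a rough back-of-the-envelope check for the numbers in the question (1 CPU, 1000 runs, up to 11 seconds per API call), the sketch below estimates how many fractional-CPU tasks fit at once and how long all of them would take. It assumes the scheduler admits tasks until the CPU resource is used up and that every call takes the full 11 seconds; it ignores scheduling overhead and the memory each worker process needs. The helper names are hypothetical.

```python
import math


def max_concurrent_tasks(total_cpus, cpus_per_task):
    # Round first to avoid floating-point artifacts in the division.
    return math.floor(round(total_cpus / cpus_per_task, 9))


def estimated_wall_time(num_tasks, seconds_per_task, concurrency):
    # Tasks run in "waves" of at most `concurrency` tasks each.
    waves = math.ceil(num_tasks / concurrency)
    return waves * seconds_per_task


concurrency = max_concurrent_tasks(total_cpus=1, cpus_per_task=0.1)
total_seconds = estimated_wall_time(num_tasks=1000, seconds_per_task=11, concurrency=concurrency)
print(concurrency, total_seconds)  # 10 1100
```

With `num_cpus=0.1` on one CPU, only 10 tasks wait on the API at a time, so 1000 runs need about 100 waves. Shrinking the fraction raises concurrency but also the number of worker processes, which is where the 1 GB of RAM becomes the constraint.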
All coroutines in an async actor share the same event loop. By the way, some CPU usage is expected, because the actor still needs to run some code. Are you saying the CPU consumption is high?
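For intuition on what sharing one event loop buys you, here is a small asyncio-only sketch. `fake_api_call` is a hypothetical stand-in that sleeps for 0.05 seconds instead of making an 11-second REST call, so the numbers are scaled down and are not a benchmark.

```python
import asyncio
import time


async def fake_api_call(i, delay):
    # Hypothetical stand-in for the REST call; sleeps instead of doing network I/O.
    await asyncio.sleep(delay)
    return i


async def main(n=1000, delay=0.05):
    start = time.perf_counter()
    # All n waits are in flight at once on a single event loop.
    results = await asyncio.gather(*(fake_api_call(i, delay) for i in range(n)))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(len(results), f"{elapsed:.2f}s")
```

The 1000 waits overlap, so the total time stays far below the 50 seconds that running them one after another would take. The CPU is only used for the short bursts of code between awaits, which is the small, expected cost mentioned above.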