How to increase ray performance for cpu and io bound operations in a task

yraut · July 26, 2021, 5:49pm

Hello Team,
I need to get data by calling a REST API and then perform some operations on it. The API call sometimes takes 11 seconds, while the operations are CPU-bound and take only a few milliseconds.

Currently I keep everything in a single task, but I am thinking of splitting it: fetching the data in an async actor and doing the computation in a task.

I am running this task with a single CPU and 1 GB of RAM.
Suppose I have to run this task 1000 times concurrently. How should I go about it?

Is this the correct approach, or am I making a mistake?

asm582 · July 27, 2021, 1:32pm

Are you trying to run the tasks concurrently or in parallel? If the data remains unchanged, have you tried adding it to Ray's object store?
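
On the concurrent-versus-parallel distinction, here is a minimal sketch in plain asyncio (not Ray) suggesting why many I/O-bound waits can overlap on a single CPU. `fake_api_call`, its 0.05-second delay, and `cpu_step` are hypothetical stand-ins for the real REST call and computation.

```python
import asyncio
import time


async def fake_api_call(i):
    # Hypothetical stand-in for the REST call: waits without using the CPU.
    await asyncio.sleep(0.05)
    return list(range(i % 10))


def cpu_step(data):
    # Stand-in for the cheap CPU-bound operation.
    return sum(data)


async def run_all(n):
    # All n waits overlap on one thread, so wall time stays far below n * 0.05 s.
    payloads = await asyncio.gather(*(fake_api_call(i) for i in range(n)))
    return [cpu_step(p) for p in payloads]


start = time.perf_counter()
results = asyncio.run(run_all(1000))
elapsed = time.perf_counter() - start
print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

Run one after another, the same 1000 waits would take about 50 seconds. Running them in parallel only helps further if the CPU-bound part becomes the bottleneck.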

yraut · July 27, 2021, 4:38pm

I am already putting the objects in the Ray object store.

asm582 · July 27, 2021, 4:54pm

Thanks. From what I know, Ray spawns worker processes to run tasks in parallel, and distributes them across nodes if you have a cluster.

Sharing a code snippet that may help

# Spawn 1000 tasks of your_func (assumed to be decorated with @ray.remote)
tasks = [your_func.remote() for _ in range(1000)]
# Get the results of all 1000 tasks
print(ray.get(tasks))

sangcho · July 28, 2021, 6:14pm

Yeah, this is the right way to go. You can have an async actor that calls the API and invokes a task based on the result. Example:

@ray.remote
class APIReadActor:
    async def run(self):
        while True:
            data = call_api()
            # Await the ObjectRef directly; a blocking ray.get() would stall the event loop.
            await cpu_task.remote(data)
            # Hack: yield to the event loop so the actor can still process other requests.
            # An async actor runs on a single thread, so a while loop that never yields
            # would prevent it from processing any other calls.
            await asyncio.sleep(0)

ericl · August 3, 2021, 8:38pm

Another approach can be to allocate fractional CPUs to tasks, for example:

@ray.remote(num_cpus=0.1)
def io_task():
    pass

This is simpler than the actor approach, but it has a limitation: you cannot make the CPU fraction arbitrarily small, since each task still needs its own worker process.

yraut · August 6, 2021, 8:16am

Thank you @sangcho. I will try making this change, see how it works, and publish the results.

yraut · August 6, 2021, 8:17am

Thank you for the reply. This still consumes some CPU…

yraut · August 6, 2021, 8:19am

One quick question.
Is each async actor executed in a separate event loop?
If yes, then I can create a separate actor for each I/O operation…

sangcho · August 9, 2021, 10:06pm

All calls on an async actor share the same event loop. By the way, some CPU usage is expected, because the actor still needs to run some code. Are you saying the CPU consumption is high?
