[Core] Reason behind process id is None

SumanthDatta · March 26, 2021, 4:54pm

Process id shows as None inside some of the methods executed by ray.util.sgd.data.dataset.

Dataset(
    <Parallel Iterator>,
    batch_size=batch_size,
    max_concurrency=1,
    download_func=lambda row: sample_method(row))

For example, in the above snippet, inside sample_method, the process id is None when I try to print the following:

print(f"task_id: {ray.get_runtime_context().task_id}")

eoakes · March 26, 2021, 5:53pm

@Alex @sangcho any idea why task_id would be None here?

sangcho · March 27, 2021, 10:17pm

Task id is not equivalent to the process id. Use os.getpid() to get the process id.
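As a minimal sketch of that suggestion, the function passed as download_func could log the OS process id directly. The body of sample_method below is a hypothetical stand-in (the original was not shown in the question), and describe_worker is a helper name made up for illustration:

```python
import os


def describe_worker():
    # os.getpid() returns the OS-level id of the process executing this call.
    # It is always an integer, whether the caller is a driver, a task, or an actor.
    return f"pid: {os.getpid()}"


def sample_method(row):
    # Hypothetical stand-in for the function from the question:
    # log which process handles the row, then pass the row through unchanged.
    print(describe_worker())
    return row


if __name__ == "__main__":
    sample_method({"id": 0})
```

Unlike the task id, this value never comes back as None, so it is the safer choice when the goal is only to tell worker processes apart.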

Also, the task id is None when the calling code is not running inside a task (e.g., it runs on a driver or in an actor). I don’t know the internal details of the dataset API, but it is highly likely that the function is being called on a driver or in an actor (try ray.get_runtime_context().get() to check).

Alex · March 27, 2021, 11:38pm

^ Agree with everything Sang said. I’ll add that with dataset/parallel iterators, max_concurrency=1 uses the existing actor, while max_concurrency > 1 is needed to spin up tasks instead.
