Running ray microbenchmark results in an error

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I have deployed a Ray cluster using ray up, with the rayproject/ray-ml:latest-gpu Docker image.

Python: 3.7.13
Ray: 2.1.0

When I run ‘ray microbenchmark’, it fails with an error at the same place every time.

single client get calls (Plasma Store) per second 5050.62 +- 17.63
single client put calls (Plasma Store) per second 4869.11 +- 21.22
multi client put calls (Plasma Store) per second 9596.21 +- 3025.49
single client put gigabytes per second 1.31 +- 0.5
single client tasks and get batch per second 5.46 +- 1.25
multi client put gigabytes per second 6.38 +- 0.79
single client get object containing 10k refs per second 9.31 +- 0.43
single client tasks sync per second 1083.13 +- 1.11
single client tasks async per second 5896.73 +- 985.62
multi client tasks async per second 10610.42 +- 5028.59
1:1 actor calls sync per second 780.92 +- 52.53
1:1 actor calls async per second 5888.15 +- 229.31
1:1 actor calls concurrent per second 5616.89 +- 1097.48
1:n actor calls async per second 8357.07 +- 94.2
n:n actor calls async per second 14414.82 +- 381.09
Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/ray", line 8, in <module>
    sys.exit(main())
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/scripts/scripts.py", line 2596, in main
    return cli()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/scripts/scripts.py", line 1750, in microbenchmark
    main()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/ray_perf.py", line 228, in main
    "n:n actor calls with arg async", actor_multi2_direct_arg, n * len(clients)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/ray_microbenchmark_helpers.py", line 26, in timeit
    fn()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/ray_perf.py", line 225, in actor_multi2_direct_arg
    ray.get([c.small_value_batch_arg.remote(n) for c in clients])
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/worker.py", line 2289, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError: ray::Client.small_value_batch_arg() (pid=51313, ip=192.168.4.87, repr=<ray._private.ray_perf.Client object at 0x7faaba6183d0>)
ray.exceptions.RayTaskError: ray::Actor.small_value_arg() (pid=40892, ip=192.168.4.99, repr=<ray._private.ray_perf.Actor object at 0x7f11a8d56510>)
  At least one of the input arguments for this task could not be computed:
ray.exceptions.OwnerDiedError: Failed to retrieve object 00289574714d49c80c3df2f12fbe6fe4c34c68ea0300000003000000. To see information about where this ObjectRef was created in Python, set the environment variable RAY_record_ref_creation_sites=1 during `ray start` and `ray.init()`.
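
The OwnerDiedError message above suggests setting RAY_record_ref_creation_sites=1 to see where the lost ObjectRef was created. Below is a minimal sketch of re-running the failing command with that variable set on the driver side. The helper names are hypothetical, and it assumes the ray CLI is on PATH. The message also asks for the variable during `ray start`, so it would presumably need to be exported on the head and worker nodes as well, which this snippet does not do.

```python
import os
import subprocess

# Variable name taken verbatim from the OwnerDiedError message.
REF_SITES_VAR = "RAY_record_ref_creation_sites"


def env_with_ref_sites(base_env=None):
    """Return a copy of the environment with ref-creation-site recording on.

    Hypothetical helper: copies the given mapping (or os.environ) so the
    caller's environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env[REF_SITES_VAR] = "1"
    return env


def rerun_microbenchmark():
    """Re-run `ray microbenchmark` with the variable set (hypothetical helper).

    Assumes the `ray` CLI is installed and on PATH, as on the cluster nodes.
    """
    return subprocess.run(
        ["ray", "microbenchmark"],
        env=env_with_ref_sites(),
        check=False,  # the run may fail again; we only want the extra output
    )
```

Calling rerun_microbenchmark() on the head node should reproduce the failure, ideally with the creation site of the lost object included in the error.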

How many CPUs and how much memory does your cluster have?

There is just one head node and one worker node. Here is the requested information:
0.0/16.0 CPU
0.00/39.697 GiB memory
0.00/18.340 GiB object_store_memory
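
For reference, the same kind of summary can be collected from Python. This is a rough sketch, not how the numbers above were produced: summarize_resources and cluster_summary are hypothetical helper names, and it assumes that ray.cluster_resources() and ray.available_resources() report memory and object_store_memory in bytes.

```python
GIB = 1024 ** 3


def summarize_resources(total, available):
    """Format 'used/total' per resource from two resource dictionaries."""
    lines = []
    for name in sorted(total):
        tot = total[name]
        # A resource missing from `available` is treated as fully in use.
        used = tot - available.get(name, 0.0)
        if name in ("memory", "object_store_memory"):
            # Assumption: these two values are reported in bytes.
            lines.append(f"{used / GIB:.2f}/{tot / GIB:.3f} GiB {name}")
        else:
            lines.append(f"{used}/{tot} {name}")
    return lines


def cluster_summary():
    """Query a running cluster; requires Ray and an already started cluster."""
    import ray  # imported lazily so the formatter works without Ray installed

    ray.init(address="auto")  # connect to the existing cluster
    return summarize_resources(ray.cluster_resources(), ray.available_resources())
```

Running print("\n".join(cluster_summary())) on the head node should list CPU, memory and object store usage, plus any other resources the cluster reports.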