Hi All,
We were following the tutorial posted here: Profiling (internal) — Ray v2.0.0.dev0.
We ran into several issues:
- Environment variable. The tutorial says to set the variable like this: export RAYLET_PERFTOOLS_PATH=/usr/lib/x86_64-linux-gnu/libprofiler.so. However, when we start the Ray cluster, it asks us to set the variable “PERFTOOLS_PATH”. Which one is correct?
- After the Ray cluster is started, we run some Python scripts that include “ray.put()” and “ray.get()”, but the file “/tmp/pprof.out” is empty. Is there something we missed?
- When we run the command “ray microbenchmark”, some data shows up in “/tmp/pprof.out”. We tried to render it as an SVG file as suggested by the tutorial, using the command: google-pprof -svg $RAYLET /tmp/pprof.out > pprof.svg. It gives the error:
/usr/bin/addr2line: /usr/lib/debug/.build-id/3a/69683d31c430fad5cb0fad190a28b9570d5577.debug: unable to initialize decompress status for section .debug_aranges
/usr/bin/addr2line: /usr/lib/debug/.build-id/3a/69683d31c430fad5cb0fad190a28b9570d5577.debug: unable to initialize decompress status for section .debug_aranges
sh: 1: dot: not found
The first one is most likely an out-of-date doc issue. I created a quick-fix PR: Update doc for profiling using the correct VARs by HuangLED · Pull Request #21561 · ray-project/ray · GitHub
For 2) and 3), I'm looking forward to hints from the Ray community.
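One small hint for 3) in the meantime: the last line of the error, "sh: 1: dot: not found", suggests that pprof could not find the Graphviz "dot" binary it uses to render the SVG. A minimal sketch for checking which helper tools are on PATH is below. The function name check_tools is made up for this example, and the apt-get hint assumes a Debian/Ubuntu image. It does not address the addr2line messages.

```shell
# Report whether each named tool is on PATH.
# Returns non-zero if any of them is missing.
# check_tools is a hypothetical helper name, not part of Ray or pprof.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# dot renders the graph for -svg; addr2line is the tool named in the error.
check_tools dot addr2line || echo "hint: on Debian/Ubuntu, dot is in the graphviz package (apt-get install graphviz)"
```

If dot shows up as missing, installing Graphviz and re-running the google-pprof command is probably the first thing to try.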
Similar to sample, there is a powerful tool called perf. Here’s a useful [link](https://www.brendangregg.com/perf.html) covering perf commands.
Install perf in the Docker container in production:
apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r`
# Profile a running process of id 5773
perf record -p 5773
# Profile PID at a sampling frequency of 99 Hz for 10 seconds.
# The recorded data can also be used to create a flame graph.
perf record -F 99 -p PID --call-graph dwarf sleep 10
To show the perf samples:
perf report
There is a cheat sheet for sampling different events of the process. https://jvns.ca/perf-cheat-sheet.pdf
You can also create a flame graph by following these instructions.
Link => search for “7.1. Flame Graphs”
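For reference, the flame graph steps can be sketched roughly as follows, assuming the scripts from https://github.com/brendangregg/FlameGraph have been cloned locally. The function name make_flamegraph and the FLAMEGRAPH_DIR variable are made up for this example. The output file names are arbitrary.

```shell
# Sketch: record PID for a fixed duration, then render a flame graph.
# make_flamegraph and FLAMEGRAPH_DIR are hypothetical names; FLAMEGRAPH_DIR
# should point at a checkout of https://github.com/brendangregg/FlameGraph.
make_flamegraph() {
    if [ "$#" -ne 2 ]; then
        echo "usage: make_flamegraph PID SECONDS" >&2
        return 2
    fi
    pid=$1
    seconds=$2
    fg_dir=${FLAMEGRAPH_DIR:-./FlameGraph}

    # Sample the process at 99 Hz with call graphs; writes perf.data.
    perf record -F 99 -p "$pid" --call-graph dwarf -- sleep "$seconds" || return 1

    # Fold the recorded stacks, then render them as an SVG.
    perf script | "$fg_dir/stackcollapse-perf.pl" > out.perf-folded || return 1
    "$fg_dir/flamegraph.pl" out.perf-folded > perf.svg
}
```

For example, make_flamegraph 5773 10 would profile process 5773 for 10 seconds and write perf.svg to the current directory.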
If you’d like to collect the performance count statistics, you can also use [perf stat](https://man7.org/linux/man-pages/man1/perf-stat.1.html).
perf stat
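A slightly fuller perf stat invocation might look like the sketch below, which counts a few common events for an already running process over a fixed window. The function name stat_pid is made up for this example, and the event list is only one reasonable choice; run perf list to see what your machine supports.

```shell
# Sketch: count selected events for an existing process for N seconds.
# stat_pid is a hypothetical helper name; the event list is an assumption
# and may need adjusting for your hardware or container.
stat_pid() {
    if [ "$#" -ne 2 ]; then
        echo "usage: stat_pid PID SECONDS" >&2
        return 2
    fi
    # "sleep" only bounds the measurement window; the counters follow PID.
    perf stat -e cycles,instructions,cache-misses -p "$1" -- sleep "$2"
}
```

For example, stat_pid 5773 10 prints the counter totals for process 5773 after 10 seconds.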
These are the instructions I use to profile Ray components!