I have the following script.
    import ray

    def f(document_batch):
        return document_batch

    def main():
        pdf_paths = ["/path/to/my/files"]
        pdf_data_set = ray.data.read_binary_files(pdf_paths)
        pdf_data_set = pdf_data_set.map_batches(f)
        pdf_data_set.show()

    if __name__ == "__main__":
        main()
It works. The files in `/path/to/my/files` are loaded into the dataset and shown.
If I put a breakpoint on `return document_batch`, it is not triggered. I verified that breakpoints inside the `main` function are triggered, and that a `5/0` line in `f` raises a divide-by-zero exception, so `f` is definitely being executed.
The same thing happens if I use an actor pool compute strategy.
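One further check I can think of is to compare the process ID seen inside the function with the ID of the process the debugger is attached to. If they differ, the function is running in a separate worker process, which might explain why the IDE breakpoint is skipped while the exception still surfaces. Below is a minimal sketch of that check using only the standard library. `tag_with_pid` and `run_in_child` are hypothetical helper names, and the subprocess only stands in for a worker; it is not how Ray actually launches tasks.

```python
import os
import subprocess
import sys


def tag_with_pid(document_batch):
    """Return the batch unchanged, plus the PID of the process running this call."""
    return document_batch, os.getpid()


def run_in_child():
    """Start a fresh Python process (a stand-in for a worker) and return its PID."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # PID of the process the IDE debugger is attached to.
    batch, local_pid = tag_with_pid(["doc"])
    # PID of a separate process, which the debugger is not attached to.
    child_pid = run_in_child()
    print("debugger process:", local_pid)
    print("child process:", child_pid)
    print("same process?", local_pid == child_pid)
```

In the Ray script, the equivalent check would be to print `os.getpid()` both inside `f` and inside `main` and compare the two values.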
Other times I’ve had debugging work. Still other times I’ve had different problems with the debugger.
Is this expected behavior? Is there anything I can do to work around it or further debug?
- Ray 2.3.1
- Python 3.9.16
- PyCharm 2021.2.3 Community Edition
- OS X 13.3