I have the following script.
import ray

def f(document_batch):
    return document_batch

def main():
    pdf_paths = ["/path/to/my/files"]
    pdf_data_set = ray.data.read_binary_files(pdf_paths)
    pdf_data_set = pdf_data_set.map_batches(f)
    pdf_data_set.show()

if __name__ == "__main__":
    main()
It works. The files in /path/to/my/files are loaded into the dataset and shown.
If I put a breakpoint on the return document_batch line, it is not triggered. I verified that breakpoints inside the main function are triggered, and that a 5/0 line in f raises a divide-by-zero exception, so f is definitely being executed.
The same thing happens if I use an actor pool compute strategy.
Other times I’ve had debugging work. Still other times I’ve had different problems with the debugger.
Is this expected behavior? Is there anything I can do to work around it or further debug?
- Ray 2.3.1
- Python 3.9.16
- PyCharm 2021.2.3 Community Edition
- macOS 13.3
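
One check that might narrow this down is to log the process ID from main and from f. This assumes the breakpoint is missed because map_batches runs f in a separate Ray worker process that the PyCharm debugger is not attached to. That is my guess, not something I have confirmed. The describe_process helper below is a hypothetical name introduced only for this sketch, and the sample batch is a placeholder.

```python
import os


def describe_process(label):
    """Return a short string identifying the process running this code."""
    return f"{label}: pid={os.getpid()}, parent pid={os.getppid()}"


def f(document_batch):
    # If this pid differs from the one printed in the main process, f is
    # running in a different process from the one the debugger is attached to.
    print(describe_process("f"))
    return document_batch


if __name__ == "__main__":
    print(describe_process("main"))
    # Called directly (outside Ray), f runs in the main process, so both
    # lines report the same pid. Pass f to map_batches to compare.
    f([b"placeholder bytes"])
```

If the pids differ when f is passed to map_batches, calling f directly on a sample batch from main should at least let the breakpoint fire, so the function's logic can be stepped through outside Ray.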