ray.data ModuleNotFoundError when debugging pytest tests in PyCharm

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I cannot debug my pytest unit tests with PyCharm’s interactive debugger if any test requires ray.data: importing ray.data under the debugger raises a ModuleNotFoundError.

Running the tests normally, not in debug mode, has no such problem.

My unit tests check the correctness of my actors’ main processing methods. I ran into this behaviour when using a Parquet file, read into a dataset and converted to a dataframe (parquet -> dataset -> dataframe), as fixture data for the tests, because that is what the actors see. I also get the same error from inside pandas code when I call pd.read_pickle on a dataframe that has a TensorDtype column.

I can work around this for now by replacing any TensorDtype columns with object types before pickling my test dataframe.

Hi, could you provide a minimal repro script?
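
While putting the repro together, it may also help to compare the import environment of a normal run with that of a debug run. Below is a minimal sketch using only the standard library. It assumes the failure comes from a difference in the interpreter, working directory, or sys.path between the two modes, which is a guess and not something confirmed in this thread. describe_import_env is a hypothetical helper name, not part of Ray or pytest.

```python
import importlib.util
import os
import sys


def describe_import_env(module_names):
    """Collect details that can differ between a normal run and a debug run.

    Hypothetical helper for illustration only.
    """
    modules = {}
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec imports parent packages first, so a missing or
            # broken parent (e.g. "ray" for "ray.data") raises here
            # instead of returning None.
            spec = None
        if spec is None:
            modules[name] = None
        else:
            # Namespace packages have no origin, so fall back to a label.
            modules[name] = spec.origin or "<namespace package>"
    return {
        "executable": sys.executable,
        "cwd": os.getcwd(),
        "sys_path": list(sys.path),
        "modules": modules,
    }


if __name__ == "__main__":
    # Print the report; run once normally and once under the debugger.
    for key, value in describe_import_env(["ray", "ray.data"]).items():
        print(f"{key}: {value}")
```

Calling describe_import_env(["ray", "ray.data"]) from inside a test, once in a normal run and once under the debugger, and diffing the two results should show whether the modes resolve ray from different locations or fail to find it at all.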

cc: @Clark_Zinzow @jianxiao