How severely does this issue affect your experience of using Ray?
- None: Just asking a question out of curiosity
- Low: It annoys or frustrates me for a moment.
- Medium: It contributes significant difficulty to completing my task, but I can work around it.
- High: It blocks me from completing my task.
I use ray.data.read_parquet to load training data (mostly tabular data, plus a couple of list features; e.g., x1 for one sample is [1,2,3,4,5]), and TorchTrainer for training on 1 worker with 1 GPU + 8 CPUs (a simplified sketch of my setup is below the list). Several observations I have are:
- When I double my batch size from 4096 to 8192, the training time doesn't change, whereas I expected it to be roughly halved.
- When I use ray.data.read_parquet(filenames).random_sample(0.1), the training time doesn't change, whereas I expected it to drop to roughly 1/10.
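
For reference, here is a simplified sketch of my setup, written against the Ray 2.x `TorchTrainer` + `ray.train.get_dataset_shard` API as I understand it. The file paths, the training-loop body, and the config keys (`batch_size`, `num_epochs`) are placeholders, not my real script:

```python
import ray
from ray import train
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    # Each worker reads its shard of the "train" dataset; batch_size is set here.
    shard = train.get_dataset_shard("train")
    for _ in range(config["num_epochs"]):
        for batch in shard.iter_torch_batches(batch_size=config["batch_size"]):
            pass  # forward/backward pass on my model (omitted)


# Placeholder paths; the real data is tabular Parquet with a few list columns.
filenames = ["s3://my-bucket/train/part-0.parquet"]
ds = ray.data.read_parquet(filenames)
# Variant from the second observation above:
# ds = ray.data.read_parquet(filenames).random_sample(0.1)

trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"batch_size": 8192, "num_epochs": 1},
    datasets={"train": ds},
    scaling_config=ScalingConfig(
        num_workers=1,
        use_gpu=True,
        resources_per_worker={"CPU": 8},
    ),
)
result = trainer.fit()
```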
Is there an example or guidance I can look into to understand why this happens and how to improve it?