Ray Libraries (Data, Train, Tune, Serve)

Ray Data: For all questions related to Ray Data. Don’t be shy - all questions are welcome!
Ray Train
Ray Tune: For all questions about Ray Tune. Don’t be shy - all questions welcome!
Ray Serve: For all questions about Ray Serve. Don’t be shy - all questions welcome!

Topic	Category	Replies	Views	Activity
About the Ray Libraries (Data, Train, Tune, Serve) category	Ray Libraries (Data, Train, Tune, Serve)	2	1041	August 6, 2024
Multiple Independent Models behind a single API endpoint?	Ray Serve	0	5	January 22, 2025
Lightgbm Trainer for distribute training use too much memory	Ray Train	0	1	January 22, 2025
Long initialization time to initialize_session with large scale dataset	Ray Train	0	6	January 21, 2025
What's the migration path for ray.data.aggregate's Max, Mean, Min, and Std functions?	Ray Data	1	20	January 13, 2025
Node fault tolerance in Ray Data	Ray Data	2	14	January 10, 2025
Fault tolerance with Actors and map_batches	Ray Libraries (Data, Train, Tune, Serve)	1	251	January 9, 2025
How to use a pre-trained model in ray serve?	Ray Tune	0	10	January 9, 2025
Ray Data ray.exceptions.GetTimeoutError: Timed out while starting actors	Ray Libraries (Data, Train, Tune, Serve)	1	12	January 8, 2025
Suppress comet warning about `None` metric when reporting no Checkpoint	Ray Libraries (Data, Train, Tune, Serve)	0	6	January 8, 2025
How to disable `object_store_memory` logging?	Ray Train	2	19	January 7, 2025
Executing Ray Train with PyTorch	Ray Train	2	63	January 6, 2025
What’s the migration path for ray.data.datasource.tfrecords_datasource.TFRecordDatasource?	Ray Data	0	11	January 5, 2025
Ray Serve - Client request Cancellation	Ray Serve	0	19	January 2, 2025
Why Ray Data read tfrecord so slow	Ray Data	1	89	January 2, 2025
Random permission error while checkpointing	Ray Libraries (Data, Train, Tune, Serve)	2	38	December 29, 2024
Ray Tune Sync Threshold Bottleneck	Ray Tune	2	30	December 25, 2024
Looking for a way to cancel ray serve task	Ray Serve	4	678	December 23, 2024
ModuleNotFoundError for torch	Ray Tune	2	22	December 20, 2024
Map parquet columns causes decoding error with binary data	Ray Data	2	43	December 19, 2024
Ray is creating hundreds of logs files under /tmp/ray/session_latest/logs/ causing disk space issue and I/O Spikes	Ray Serve	7	765	December 17, 2024
TorchTrainer fails ROCM multi gpu. Invalid device ordinal	Ray Libraries (Data, Train, Tune, Serve)	5	22	December 13, 2024
Check failed: worker->GetAssignedJobId().IsNil()	Ray Serve	1	21	December 11, 2024
Hyperparameter optimization on Slurm using DistributedDataParallel and mpi4py	Ray Tune	3	15	December 11, 2024
Ray Serve - Observing high latencies when using custom docker image	Ray Libraries (Data, Train, Tune, Serve)	0	7	December 11, 2024
Ray tune exceeding memory -- how to set limit?	Ray Tune	2	932	December 10, 2024
Scaling Ray Serve efficiently	Ray Serve	0	18	December 10, 2024
Dynamically serve new model via Ray Serve	Ray Serve	0	15	December 7, 2024
[Data] map_batches is not respecting concurrency from the beginning	Ray Libraries (Data, Train, Tune, Serve)	1	70	December 6, 2024
Customized progress_reporter	Ray Libraries (Data, Train, Tune, Serve)	3	29	December 6, 2024