BatchPredictors for TensorRT/AITemplate models

plum9 · January 20, 2023, 2:00pm

Hi,

I was looking into using Ray AIR and Ray Cluster to perform offline batch inference on a large dataset. The model is accelerated using TensorRT or, in some cases, AITemplate (AIT). To improve cost, latency, and performance, I wanted to use these with Ray AIR BatchPredictors.

My understanding is that a checkpoint can only be loaded for the predefined ML frameworks. How should I go about writing a simple wrapper to load weights for models optimized by XLAs?

Or should I write code using raw actors and ActorPools?
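
To make the question concrete, here is a minimal sketch of the kind of wrapper I have in mind. It is not tied to any Ray API: `EnginePredictor`, `fake_load_engine`, and the `"input"`/`"prediction"` keys are all hypothetical names, and the loader is a stand-in for whatever actually deserializes a TensorRT or AIT engine. The idea is a stateful callable that loads the engine once per worker and reuses it for every batch, which as far as I can tell is the shape that both actor-based approaches and Ray Data's class-based `map_batches` functions expect.

```python
from typing import Any, Callable, Dict, List

# Type of a loaded engine: takes a list of inputs, returns a list of outputs.
Engine = Callable[[List[Any]], List[Any]]


class EnginePredictor:
    """Stateful callable: loads a compiled engine once, then reuses it per batch."""

    def __init__(self, engine_path: str, load_engine: Callable[[str], Engine]):
        # The constructor is meant to run on the worker, so the (non-picklable)
        # engine is created there instead of being shipped from the driver.
        self._engine = load_engine(engine_path)

    def __call__(self, batch: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
        # "input" and "prediction" are assumed column names, not a Ray convention.
        return {"prediction": self._engine(batch["input"])}


load_calls = []  # records every load so we can check the engine is loaded once


def fake_load_engine(path: str) -> Engine:
    """Hypothetical stand-in for deserializing a TensorRT/AIT engine from disk."""
    load_calls.append(path)

    def run(inputs: List[Any]) -> List[Any]:
        # Placeholder "inference": doubles each input value.
        return [2 * x for x in inputs]

    return run


predictor = EnginePredictor("model.engine", fake_load_engine)
first = predictor({"input": [1, 2, 3]})
second = predictor({"input": [10]})
```

With a real loader swapped in for `fake_load_engine`, one instance of this class would live in each actor, so the cost of loading the engine is paid once per actor rather than once per batch.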

Thanks

Huaiwei_Sun · February 3, 2023, 6:20am

It’s better to ask this question in the Ray AIR category.

cc: @kai @matthewdeng
