How to use BERT tokenizer in ray cluster?

deepankar_chanda · December 19, 2020, 9:13am

Hello Team,

ray==0.8.4

I have a 3-node cluster set up, and I am doing text pre-processing that also needs the BERT tokenizer. I want to distribute the tokenization step as well.
I am looking for suggestions on how to distribute it; at the moment this is the only piece that runs locally.

The options I am considering:
1. Using Ray[serve]
2. Distributing a Ray service only for tokenization, although I would like to avoid this because of the network overhead
3. Using cloudpickle

Thanks in advance.

deepankar_chanda · December 21, 2020, 8:16am

Consider it resolved.

rliaw · December 22, 2020, 8:41am

Awesome; what was your resolution?
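
The thread does not record how this was resolved. For readers with the same question, here is a minimal sketch of one possible approach: keep one tokenizer inside each of a few Ray actors and send them batches of text. It is not the original poster's solution. The names `make_batches`, `TokenizerWorker`, and `tokenize_on_cluster`, the model name `bert-base-uncased`, the batch size, and the worker count are all assumptions for illustration. It also assumes a `transformers` release in which the tokenizer object is callable (3.0 or later) and that `transformers` is installed on every node.

```python
from typing import Iterator, List, Optional, Sequence


def make_batches(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


class TokenizerWorker:
    """Hypothetical actor body: loads the tokenizer once per worker process."""

    def __init__(self, model_name: str = "bert-base-uncased"):
        # Imported here so the import and vocabulary load happen on the worker,
        # instead of pickling a tokenizer on the driver and shipping it over.
        from transformers import BertTokenizerFast

        self.tokenizer = BertTokenizerFast.from_pretrained(model_name)

    def encode(self, batch: List[str]) -> List[List[int]]:
        # Return plain lists of token ids, one list per input text.
        return self.tokenizer(batch, truncation=True)["input_ids"]


def tokenize_on_cluster(
    texts: Sequence[str],
    num_workers: int = 3,        # assumption: one actor per node
    batch_size: int = 256,       # assumption: tune to your text lengths
    address: Optional[str] = None,  # address of an existing cluster; None starts Ray locally
) -> List[List[int]]:
    import ray

    if not ray.is_initialized():
        ray.init(address=address)

    # Turn the plain class into an actor class and start the workers.
    # Ray decides where each actor runs based on available resources.
    actor_cls = ray.remote(TokenizerWorker)
    workers = [actor_cls.remote() for _ in range(num_workers)]

    # Assign batches to the workers round-robin.
    futures = [
        workers[i % num_workers].encode.remote(batch)
        for i, batch in enumerate(make_batches(texts, batch_size))
    ]

    # ray.get on a list returns results in the same order as the futures.
    results = ray.get(futures)
    return [ids for batch_ids in results for ids in batch_ids]
```

Calling `tokenize_on_cluster(my_texts, address=...)` from the driver would return one list of token ids per input text, in input order. Sending larger batches per call is one way to keep the network overhead mentioned in the question small relative to the tokenization work.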
