Hi
I am trying to use Ray to speed up inference; could you please help me with an example? I am using the zero-shot classification pipeline. My dataset has 400K sentences and 52 labels.
```python
%%time
import ray
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="typeform/distilbert-base-uncased-mnli")

ray.init()
# Put the sentence list and the pipeline into the object store once,
# so every task shares them instead of re-serialising per call
test_name_lst_ray = ray.put(test_name_lst)
classifier_ref = ray.put(classifier)

@ray.remote
def get_label_score(classifier, sequence, i, labels, n=1):
    seq = sequence[i]
    res_dict = classifier(seq, labels)
    return res_dict

res_list = ray.get([get_label_score.remote(classifier_ref, test_name_lst_ray, i, tag_values)
                    for i in range(100)])
```
I am running the above code on 100 sentences, but the timing is the same as when I run a plain loop; in fact Ray adds about 15 seconds of overhead.
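For comparison, the sequential loop I am benchmarking against looks roughly like this (a sketch with a stub in place of the real pipeline so it is self-contained; the stub, sample sentences, and labels are hypothetical stand-ins):

```python
# Stub standing in for the zero-shot pipeline: it returns the same
# dict shape (sequence, labels, scores) without loading a model.
def dummy_classifier(sequence, labels):
    return {"sequence": sequence, "labels": labels, "scores": [1.0] * len(labels)}

test_name_lst = [f"sentence {i}" for i in range(100)]  # stand-in data
tag_values = ["label_a", "label_b"]                    # stand-ins for my 52 labels

# Plain sequential baseline: one pipeline call per sentence
res_list = [dummy_classifier(s, tag_values) for s in test_name_lst]
```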
When I run on a GPU, it also takes the same time. Below is the GPU code.
```python
%%time
import ray
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="typeform/distilbert-base-uncased-mnli",
                      device=0)

ray.init()
test_name_lst_ray = ray.put(test_name_lst)
classifier_ref = ray.put(classifier)

@ray.remote(num_gpus=1)
def get_label_score(classifier, sequence, i, labels, n=1):
    seq = sequence[i]
    res_dict = classifier(seq, labels)
    return res_dict

res_list = ray.get([get_label_score.remote(classifier_ref, test_name_lst_ray, i, tag_values)
                    for i in range(100)])
```
I have access to a CPU machine with 16 vCPUs and 32 GB RAM, as well as a machine with 2 GPUs, each with 12 GB of memory.
Please advise on how I can use Ray to parallelise and speed up inference on my large dataset.
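One direction I am considering is giving each Ray task a batch of sentences instead of a single one, since per-task overhead seems to dominate; a rough chunking helper I could use (a hypothetical sketch, not something I have benchmarked):

```python
def chunks(seq, size):
    """Yield successive slices of seq with at most `size` items each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

# e.g. split 10 items into batches of 4
batches = list(chunks(list(range(10)), 4))
# batches == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Each batch could then be passed to one remote task, so the 400K sentences become a few hundred tasks rather than 400K.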
Regards,
Subham