Using Ray with a transformers pipeline for inference

Hi

I am trying to use Ray to speed up inference. Could you please help me with an example?

I am using the zero-shot classification pipeline. My dataset has 400K sentences and 52 labels.

```python
%%time
import ray
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="typeform/distilbert-base-uncased-mnli")

ray.init()
# Put the sentence list and the classifier in the object store once
test_name_lst_ray = ray.put(test_name_lst)
classifier_ref = ray.put(classifier)

@ray.remote
def get_label_score(classifier, sequence, i, labels):
    seq = sequence[i]
    return classifier(seq, labels)

# One remote task per sentence
res_list = ray.get([get_label_score.remote(classifier_ref, test_name_lst_ray, i, tag_values)
                    for i in range(100)])
```

I am running the above code for 100 sentences, but the timing is the same as when I run a plain loop; in fact Ray adds about 15 seconds.
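My guess is that per-sentence tasks are too fine-grained, so the per-task overhead swamps the actual work. I am considering batching sentences so that each remote task classifies a whole chunk. Here is a minimal sketch of the chunking I have in mind (pure Python; `make_chunks` is my own helper name):

```python
def make_chunks(items, n_chunks):
    """Split items into roughly n_chunks contiguous chunks of near-equal size."""
    chunk_size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]

# Example: 100 sentences split for ~16 parallel tasks
sentences = [f"sentence {i}" for i in range(100)]
chunks = make_chunks(sentences, 16)
print(len(chunks), len(chunks[0]))  # prints: 15 7
```

Each chunk would then go to a single remote task instead of one task per sentence (e.g. a variant of `get_label_score` that takes a list and returns a list of results). Would batching like this help, or is the bottleneck somewhere else?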

When I run on a GPU, it also takes the same time. Below is the GPU code.

```python
%%time
import ray
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="typeform/distilbert-base-uncased-mnli", device=0)

ray.init()
test_name_lst_ray = ray.put(test_name_lst)
classifier_ref = ray.put(classifier)

# Each task reserves a whole GPU
@ray.remote(num_gpus=1)
def get_label_score(classifier, sequence, i, labels):
    seq = sequence[i]
    return classifier(seq, labels)

res_list = ray.get([get_label_score.remote(classifier_ref, test_name_lst_ray, i, tag_values)
                    for i in range(100)])
```

I have access to a CPU machine with 16 vCPUs and 32 GB RAM, as well as a machine with 2 GPUs, each with 12 GB of memory.
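With the 2 GPUs, I was thinking of switching from per-sentence tasks to long-lived workers that load the model once and then classify many chunks, instead of shipping the classifier with every task. Below is a plain-Python stand-in for the actor pattern I have in mind (no Ray here; `DummyClassifier` and the worker class are my own placeholders, untested as an actual Ray actor):

```python
class DummyClassifier:
    """My placeholder for pipeline('zero-shot-classification', ...)."""
    def __call__(self, sequence, labels):
        # Returns a dict shaped like the real pipeline output
        return {"sequence": sequence, "labels": list(labels),
                "scores": [1.0 / len(labels)] * len(labels)}

class ZeroShotWorker:
    """In the real version this would be a Ray actor,
    e.g. decorated with @ray.remote(num_gpus=1)."""
    def __init__(self):
        # Real version: self.classifier = pipeline(..., device=0),
        # so the model loads once per worker, not once per task.
        self.classifier = DummyClassifier()

    def classify_chunk(self, sentences, labels):
        return [self.classifier(s, labels) for s in sentences]

# Round-robin chunks over 2 workers (one per GPU in the real version)
workers = [ZeroShotWorker() for _ in range(2)]
sentences = [f"sentence {i}" for i in range(6)]
labels = ["a", "b"]
results = []
for i, chunk in enumerate([sentences[0:3], sentences[3:6]]):
    # Real version: futures.append(workers[i % 2].classify_chunk.remote(chunk, labels))
    results.extend(workers[i % 2].classify_chunk(chunk, labels))
print(len(results))  # prints: 6
```

Does this actor-style pattern make sense for my setup, or is there a better approach?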

Please advise on how I can use Ray to parallelize and speed up inference on my large dataset.

Regards,
Subham