Hi,
I am trying to deploy an app via a RayService on a KubeRay cluster, but I get a module error message on the dashboard.
Here is the repository structure that I use to build a custom Ray image:
The Dockerfile content:
FROM rayproject/ray-ml:2.9.2
# Set the working dir for the container to /serve_app
WORKDIR /serve_app
COPY dockerfiles/kuberay/. /serve_app
RUN pip install -r ./requirements.txt
The content of the Python script, mistral.py:
import ray
from ray import serve
from starlette.requests import Request
from llama_cpp import Llama
from fastapi import FastAPI

# Specify a runtime environment for the entire Ray job
ray.init(address="auto", runtime_env={"working_dir": "/home/ray/serve_app"})

app = FastAPI()


@serve.deployment(
    route_prefix="/"
)
@serve.ingress(app)
class Mistral7BQ4KM:
    def __init__(self):
        # Path to the GGUF model downloaded from Hugging Face
        model_path = "./mistral-7b-instruct-v0.1.Q4_K_M.gguf"
        self.model = Llama(
            model_path=model_path,
            verbose=True,
            n_ctx=516,
            n_threads=4
        )

    @app.post("/mistral")
    def completion(self, text: str) -> str:
        # Prompt creation
        system_message = "You are a helpful assistant"
        user_message = text
        prompt = f"""<s>[INST] <<SYS>>
{system_message}
<</SYS>>
{user_message} [/INST]"""
        outputs = self.model.create_completion(prompt, max_tokens=None)
        return outputs["choices"][0]["text"]


deployment = Mistral7BQ4KM.bind()
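For reference, the prompt template used in completion() renders like this for a sample input (a quick standalone check outside Ray; the sample user message here is just an example):

```python
# Standalone rendering of the [INST] prompt template from completion()
system_message = "You are a helpful assistant"
user_message = "What is Ray Serve?"  # example input, not from the real app
prompt = f"""<s>[INST] <<SYS>>
{system_message}
<</SYS>>
{user_message} [/INST]"""
print(prompt)
```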
The content of the RayService YAML file:
apiVersion: ray.io/v1alpha1
kind: RayService
metadata:
  name: rayservice-mistral
spec:
  serviceUnhealthySecondThreshold: 300
  deploymentUnhealthySecondThreshold: 300
  serveConfigV2: |
    applications:
      - name: mistral
        import_path: mistral:deployment
        # route_prefix: /
        runtime_env:
          working_dir: "/home/ray/serve_app"
  rayClusterConfig:
    rayVersion: '2.9.2' # Should match the Ray version in the containers
    headGroupSpec:
      rayStartParams:
        dashboard-host: '0.0.0.0'
      template:
        spec:
          containers:
            - name: ray-head
              image: ylecroart/ray-ml:2.9.2
              resources:
                limits:
                  cpu: 2
                  memory: 2Gi
                requests:
                  cpu: 2
                  memory: 2Gi
              ports:
                - containerPort: 6379
                  name: gcs-server
                - containerPort: 8265 # Ray dashboard
                  name: dashboard
                - containerPort: 10001
                  name: client
                - containerPort: 8000
                  name: serve
    workerGroupSpecs:
      - replicas: 1
        minReplicas: 1
        maxReplicas: 1
        groupName: worker-group
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker
                image: ylecroart/ray-ml:2.9.2
                lifecycle:
                  preStop:
                    exec:
                      command: ["/bin/sh", "-c", "ray stop"]
                resources:
                  limits:
                    cpu: "1"
                    memory: "2Gi"
                  requests:
                    cpu: "500m"
                    memory: "2Gi"
After running kubectl apply, I get the following error message on the dashboard:
Deploying app 'mistral' failed with exception:
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/serve/_private/application_state.py", line 994, in build_serve_application
    app = call_app_builder_with_args_if_necessary(import_attr(import_path), args)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/_private/utils.py", line 1182, in import_attr
    module = importlib.import_module(module_name)
  File "/home/ray/anaconda3/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'mistral'
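From the traceback, the import_path "mistral:deployment" boils down to importlib.import_module("mistral") followed by a getattr, so the module has to be findable on sys.path (which is what the runtime_env working_dir is supposed to arrange). I can reproduce that resolution locally with a stub module (a minimal sketch; the stub file and its placeholder value are just for illustration):

```python
# Minimal reproduction of how Serve resolves import_path "module:attr":
# the module must be importable from the working_dir, then getattr
# pulls out the named attribute.
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as working_dir:
    # Stand-in for mistral.py inside the runtime_env working_dir
    with open(os.path.join(working_dir, "mistral.py"), "w") as f:
        f.write("deployment = 'bound-deployment-placeholder'\n")

    module_name, attr_name = "mistral:deployment".split(":")
    sys.path.insert(0, working_dir)  # what working_dir effectively does
    try:
        module = importlib.import_module(module_name)
        deployment = getattr(module, attr_name)
    finally:
        sys.path.remove(working_dir)
        sys.modules.pop(module_name, None)

print(deployment)  # resolves only because working_dir was on sys.path
```

If working_dir is missing from sys.path (or the file is not actually there), import_module raises exactly the ModuleNotFoundError above.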
What am I doing wrong? I understand that Ray's worker processes may run from working directories different from my driver script's, but I thought that specifying the absolute path in working_dir would do the trick.
Could you please help me out?
Thanks in advance.
Regards,