TypeError: Failed to serialize the ASGI app.:

High: Completely blocks me.

  • Ray version: 2.51.0
  • Python version: 3.12.12
  • OS: Windows
  • Cloud/Infrastructure: Colab

I keep getting this error when I try to run the code below.

TypeError: self.handle cannot be converted to a Python object for pickling

The above exception was the direct cause of the following exception:


TypeError                                 Traceback (most recent call last)

/tmp/ipython-input-73983139.py in <cell line: 0>()
      9 
     10 @ray.serve.deployment
---> 11 @ray.serve.ingress(app)
     12 class Ensemble:
     13     def __init__(self, model1, model2):


/usr/local/lib/python3.12/dist-packages/ray/serve/api.py in decorator(cls)
    302             ensure_serialization_context()
    303             frozen_app_or_func = cloudpickle.loads(
--> 304                 pickle_dumps(app, error_msg="Failed to serialize the ASGI app.")
    305             )
    306 


/usr/local/lib/python3.12/dist-packages/ray/_common/serialization.py in pickle_dumps(obj, error_msg)
     30         msg = f"{error_msg}:\n{sio.getvalue()}"
     31         if isinstance(e, TypeError):
---> 32             raise TypeError(msg) from e
     33         else:
     34             raise ray.exceptions.OutOfBandObjectRefSerializationException(msg)


TypeError: Failed to serialize the ASGI app.:

```python
import asyncio

import fastapi
import pandas as pd
import ray
import xgboost
from pydantic import BaseModel

app = fastapi.FastAPI()

class Payload(BaseModel):
    passenger_count: int
    trip_distance: float
    fare_amount: float
    tolls_amount: float

@ray.serve.deployment
@ray.serve.ingress(app)
class Ensemble:
    def __init__(self, model1, model2):
        self.model1 = model1
        self.model2 = model2

    @app.post("/predict")
    async def predict(self, data: Payload) -> dict:
        model1_prediction, model2_prediction = await asyncio.gather(
            self.model1.predict.remote([data.model_dump()]),
            self.model2.predict.remote([data.model_dump()]),
        )
        out = {"prediction": float(model1_prediction + model2_prediction) / 2}
        return out

@ray.serve.deployment
class Model:
    def __init__(self, path: str):
        self._model = xgboost.Booster()
        self._model.load_model(path)

    def predict(self, data: list[dict]) -> list[float]:
        # Make prediction
        dmatrix = xgboost.DMatrix(pd.DataFrame(data))
        model_prediction = self._model.predict(dmatrix)
        return model_prediction

# Run the deployment
handle = ray.serve.run(
    Ensemble.bind(
        model1=Model.bind(model_path),
        model2=Model.bind(model_path),
    ),
    route_prefix="/ensemble",
)
```

The error TypeError: Failed to serialize the ASGI app. occurs because Ray Serve pickles the FastAPI app object passed to @serve.ingress(app), and something reachable from that app is not serializable. Common causes are non-serializable objects (such as Ray handles) captured at module level, attached to the FastAPI app state, or referenced from closures or class attributes at the time of serialization. The FastAPI app passed to @serve.ingress(app) must be picklable; Ray handles and other non-serializable objects should be created inside the deployment class, not at module/global level or on the app object itself.
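The failure mode can be reproduced with plain pickle, independent of Ray: an object holding unpicklable state (a threading lock here, standing in for a live handle) raises TypeError on pickle.dumps, while a plain data object serializes fine. The FakeHandle and CleanPayload names below are illustrative only, not part of any Ray API:

```python
import pickle
import threading

class FakeHandle:
    """Illustrative stand-in for a live handle: holds an unpicklable lock."""
    def __init__(self):
        self.lock = threading.Lock()  # _thread.lock objects cannot be pickled

class CleanPayload:
    """Plain data object with only picklable attributes."""
    def __init__(self):
        self.passenger_count = 1
        self.trip_distance = 2.5

try:
    pickle.dumps(FakeHandle())
    serializable = True
except TypeError as e:
    serializable = False
    print(f"FakeHandle failed to pickle: {e}")

print(f"FakeHandle picklable: {serializable}")
print(f"CleanPayload picklable: {bool(pickle.dumps(CleanPayload()))}")
```

The same rule applies to anything Ray Serve has to ship to a replica: if any attribute in the object graph cannot be pickled, serialization of the whole graph fails.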

To fix this, ensure that:

  • The FastAPI app is defined at the top level and does not reference Ray handles or other non-serializable objects.
  • Ray handles (self.model1, self.model2) are assigned only inside the deployment class's __init__ and are not attached to the FastAPI app or used as app dependencies at definition time.
  • No global variables or closures captured by the FastAPI app hold Ray handles or other non-serializable objects.

Would you like a step-by-step example of how to structure your code to avoid this serialization error?


Here is a restructured version that creates the FastAPI app inside the deployment, so nothing non-serializable has to be pickled:

```python
import fastapi
from pydantic import BaseModel
import ray
from ray import serve
from ray.serve.handle import DeploymentHandle
import xgboost
import pandas as pd
import asyncio

@ray.serve.deployment
class Ensemble:
    def __init__(self, model1: DeploymentHandle, model2: DeploymentHandle):
        self.model1 = model1
        self.model2 = model2

        # Define Payload model inside __init__ to avoid serialization issues
        class Payload(BaseModel):
            passenger_count: int
            trip_distance: float
            fare_amount: float
            tolls_amount: float

        # Create FastAPI app inside the deployment
        self.app = fastapi.FastAPI()

        # Create endpoint function
        async def predict_endpoint(data: Payload):
            model1_prediction, model2_prediction = await asyncio.gather(
                self.model1.predict.remote([data.model_dump()]),
                self.model2.predict.remote([data.model_dump()]),
            )
            out = {"prediction": float(model1_prediction + model2_prediction) / 2}
            return out

        # Register the endpoint
        self.app.post("/predict")(predict_endpoint)

    async def __call__(self, request: fastapi.Request):
        return await self.app(request.scope, request.receive, request._send)

@ray.serve.deployment
class Model:
    def __init__(self, path: str):
        self._model = xgboost.Booster()
        self._model.load_model(path)

    def predict(self, data: list[dict]) -> list[float]:
        # Make prediction
        dmatrix = xgboost.DMatrix(pd.DataFrame(data))
        model_prediction = self._model.predict(dmatrix)
        return model_prediction

# Make sure Ray is initialized
if not ray.is_initialized():
    ray.init()

# Run the deployment
handle = ray.serve.run(
    Ensemble.bind(
        model1=Model.bind(model_path),
        model2=Model.bind(model_path),
    ),
    route_prefix="/ensemble",
)
```
I resolved the error.