Best way to scale ingestion of IoT sensor streams with Ray?

For high-volume, low-latency ingestion of small sensor events in Ray, the recommended pattern is to use async actors, or a pool of them, for I/O-bound workloads. A single async actor multiplexes many concurrent calls on one event loop, so it handles concurrent event streams efficiently; throughput then scales by adding actors across cores and nodes rather than within one actor. Ray Serve can also be used for HTTP-based ingestion, providing autoscaling and request routing, but for pure event streaming (especially over non-HTTP protocols), async actors are often more efficient. Ray’s built-in queue (ray.util.queue.Queue) is available, but it is backed by a single actor that every put and get passes through, so it is generally slower and less scalable than actor-based approaches for this use case. For batch or offline processing, Ray Data is preferred; for real-time streaming, async actors or Serve deployments with async endpoints are best suited.

Lessons learned:

  • For I/O-bound sensor streams, start with a single async actor and load test; scale to multiple actors if the event loop saturates.
  • Use actor pools to distribute events across N actors for load balancing.
  • Ray Serve is ideal if you want HTTP endpoints with autoscaling and built-in request routing.
  • Avoid using Ray’s distributed Queue for high-throughput ingestion, as it introduces more overhead than actor-based solutions.
  • For MQTT/WebSocket ingestion, use an async actor to subscribe and process messages, then fan out to worker actors if needed.

Would you like a code example or more detail on a specific ingestion pattern?
