Details on Ray request routing and load balancing

Can we access Ray Serve deployment actors directly, without going through the head node? How can we work around the head node being a bottleneck?