Log and print statements appear in the terminal and docker logs until the Deployment, i.e. Api.deploy()
is called. Once our deployment actor, implemented with FastAPI ingress, is started, the log and print statements disappear.
Example terminal output with Ray v1.11.1 or older (working):
❯ docker logs api
2022-06-23 13:00:16,998 INFO scripts.py:600 -- Local node IP: 192.168.1.98
2022-06-23 13:00:18,747 INFO services.py:1412 -- View the Ray dashboard at http://192.168.1.98:8265
2022-06-23 13:00:19,066 SUCC scripts.py:640 -- --------------------
2022-06-23 13:00:19,066 SUCC scripts.py:641 -- Ray runtime started.
2022-06-23 13:00:19,066 SUCC scripts.py:642 -- --------------------
2022-06-23 13:00:19,066 INFO scripts.py:644 -- Next steps
2022-06-23 13:00:19,066 INFO scripts.py:645 -- To connect to this Ray runtime from another node, run
2022-06-23 13:00:19,066 INFO scripts.py:649 -- ray start --address='192.168.1.98:6379' --redis-password='5241590000000000'
2022-06-23 13:00:19,066 INFO scripts.py:654 -- Alternatively, use the following Python code:
2022-06-23 13:00:19,066 INFO scripts.py:656 -- import ray
2022-06-23 13:00:19,066 INFO scripts.py:660 -- ray.init(address='auto', _redis_password='5241590000000000')
2022-06-23 13:00:19,067 INFO scripts.py:670 -- To connect to this Ray runtime from outside of the cluster, for example to
2022-06-23 13:00:19,067 INFO scripts.py:672 -- connect to a remote cluster from your laptop directly, use the following
2022-06-23 13:00:19,067 INFO scripts.py:674 -- Python code:
2022-06-23 13:00:19,067 INFO scripts.py:676 -- import ray
2022-06-23 13:00:19,067 INFO scripts.py:677 -- ray.init(address='ray://<head_node_ip_address>:10001')
2022-06-23 13:00:19,067 INFO scripts.py:682 -- If connection fails, check your firewall settings and network configuration.
2022-06-23 13:00:19,067 INFO scripts.py:687 -- To terminate the Ray runtime, run
2022-06-23 13:00:19,067 INFO scripts.py:688 -- ray stop
(ServeController pid=224) 2022-06-23 13:00:20,562 INFO checkpoint_path.py:16 -- Using RayInternalKVStore for controller checkpoint and recovery.
(ServeController pid=224) 2022-06-23 13:00:20,677 INFO http_state.py:98 -- Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-node:192.168.1.98-0' on node 'node:192.168.1.98-0' listening on '127.0.0.1:7777'
2022-06-23 13:00:21,329 INFO api.py:521 -- Started detached Serve instance in namespace 'serve'.
(HTTPProxyActor pid=258) INFO: Started server process [258]
2022-06-23 13:00:24,572 DEBUG deploy:run -- Total Resources = {'node:192.168.1.98': 1.0, 'accelerator_type:RTX': 1.0, 'CPU': 32.0, 'object_store_memory': 40384510771.0, 'memory': 84230525133.0, 'GPU': 2.0}
2022-06-23 13:00:24,573 DEBUG deploy:run -- Available Resources = {'node:192.168.1.98': 0.98, 'GPU': 2.0, 'memory': 84230525133.0, 'CPU': 32.0, 'object_store_memory': 40384510771.0, 'accelerator_type:RTX': 1.0}
2022-06-23 13:00:25,503 INFO api.py:262 -- Updating deployment 'Api'. component=serve deployment=Api
(ServeController pid=224) 2022-06-23 13:00:25,602 INFO deployment_state.py:920 -- Adding 1 replicas to deployment 'Api'. component=serve deployment=Api
(Api pid=335) 2022-06-23 13:00:27,181 DEBUG api:most_recent_field_of_view -- records count=0
(Api pid=335) 2022-06-23 13:00:27,181 INFO api:most_recent_field_of_view -- Creating a default Field of View (FOV)
(Api pid=335) 2022-06-23 13:00:27,197 DEBUG api:most_recent_scale_bar -- records count=0
(Api pid=335) 2022-06-23 13:00:27,197 INFO api:most_recent_scale_bar -- Creating a default scale bar
(Api pid=335) 2022-06-23 13:00:27,202 DEBUG api:__init__ -- yolo_v5_weights_dir=/mnt/nvme/theia/YOLO/v5
(Api pid=335) 2022-06-23 13:00:27,202 DEBUG api:__init__ -- file_read_chunk_size=1024
2022-06-23 13:00:27,254 INFO api.py:274 -- Deployment 'Api' is ready at `http://127.0.0.1:7777/api`. component=serve deployment=Api
2022-06-23 13:00:27,276 DEBUG deploy:run -- Deployments = {'Api': Deployment(name=Api,version=None,route_prefix=/api)}
Example terminal output with Ray v1.12.0 or newer (problem):
❯ docker logs api
...
2022-06-23 12:54:38,725 INFO usage_lib.py:320 -- Usage stats collection is enabled by default without user confirmation because this stdin is detected to be non-interactively. To disable this, add `--disable-usage-stats` to the command that starts the cluster, or run the following command: `ray disable-usage-stats` before starting the cluster. See https://docs.ray.io/en/master/cluster/usage-stats.html for more details.
2022-06-23 12:54:38,725 INFO scripts.py:715 -- Local node IP: 192.168.1.98
2022-06-23 12:54:40,577 INFO services.py:1470 -- View the Ray dashboard at http://192.168.1.98:8265
2022-06-23 12:54:40,898 SUCC scripts.py:757 -- --------------------
2022-06-23 12:54:40,899 SUCC scripts.py:758 -- Ray runtime started.
2022-06-23 12:54:40,899 SUCC scripts.py:759 -- --------------------
2022-06-23 12:54:40,899 INFO scripts.py:761 -- Next steps
2022-06-23 12:54:40,899 INFO scripts.py:762 -- To connect to this Ray runtime from another node, run
2022-06-23 12:54:40,899 INFO scripts.py:765 -- ray start --address='192.168.1.98:6379'
2022-06-23 12:54:40,899 INFO scripts.py:770 -- Alternatively, use the following Python code:
2022-06-23 12:54:40,899 INFO scripts.py:772 -- import ray
2022-06-23 12:54:40,899 INFO scripts.py:776 -- ray.init(address='auto')
2022-06-23 12:54:40,899 INFO scripts.py:788 -- To connect to this Ray runtime from outside of the cluster, for example to
2022-06-23 12:54:40,899 INFO scripts.py:792 -- connect to a remote cluster from your laptop directly, use the following
2022-06-23 12:54:40,899 INFO scripts.py:796 -- Python code:
2022-06-23 12:54:40,900 INFO scripts.py:798 -- import ray
2022-06-23 12:54:40,900 INFO scripts.py:799 -- ray.init(address='ray://<head_node_ip_address>:10001')
2022-06-23 12:54:40,900 INFO scripts.py:808 -- If connection fails, check your firewall settings and network configuration.
2022-06-23 12:54:40,900 INFO scripts.py:816 -- To terminate the Ray runtime, run
2022-06-23 12:54:40,900 INFO scripts.py:817 -- ray stop
(ServeController pid=224) INFO 2022-06-23 12:54:42,497 controller 224 checkpoint_path.py:17 - Using RayInternalKVStore for controller checkpoint and recovery.
(ServeController pid=224) INFO 2022-06-23 12:54:42,600 controller 224 http_state.py:112 - Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-node:192.168.1.98-0' on node 'node:192.168.1.98-0' listening on '127.0.0.1:7777'
(HTTPProxyActor pid=256) INFO: Started server process [256]
2022-06-23 12:54:46,307 DEBUG deploy:run -- Total Resources = {'accelerator_type:RTX': 1.0, 'object_store_memory': 39170103296.0, 'node:192.168.1.98': 1.0, 'CPU': 32.0, 'GPU': 2.0, 'memory': 85425603584.0}
2022-06-23 12:54:46,308 DEBUG deploy:run -- Available Resources = {'CPU': 32.0, 'node:192.168.1.98': 0.98, 'GPU': 2.0, 'accelerator_type:RTX': 1.0, 'object_store_memory': 39170103296.0, 'memory': 85425603584.0}
2022-06-23 12:54:49,501 DEBUG deploy:run -- Deployments = {'Api': Deployment(name=Api,version=None,route_prefix=/api)}
The container/service is named, “api”. Terminal output, i.e. STDOUT, is the same but there are statements from other services and containers in the multi-container application. The following statements are missing from the output in Ray v1.12 or newer compared to Ray v1.11.1 or older:
(Api pid=335) 2022-06-23 13:00:27,181 DEBUG api:most_recent_field_of_view -- records count=0
(Api pid=335) 2022-06-23 13:00:27,181 INFO api:most_recent_field_of_view -- Creating a default Field of View (FOV)
(Api pid=335) 2022-06-23 13:00:27,197 DEBUG api:most_recent_scale_bar -- records count=0
(Api pid=335) 2022-06-23 13:00:27,197 INFO api:most_recent_scale_bar -- Creating a default scale bar
(Api pid=335) 2022-06-23 13:00:27,202 DEBUG api:__init__ -- yolo_v5_weights_dir=/mnt/nvme/theia/YOLO/v5
(Api pid=335) 2022-06-23 13:00:27,202 DEBUG api:__init__ -- file_read_chunk_size=1024
These missing statements are just part of the missing log statements. It is not just the Api
actor/deployment. After the Api.deploy()
is called, the log statements from any other task or actor are missing from the terminal output and docker logs. These still appear in separate logs in the Ray Dashboard for each task and actor.
The last three DEBUG
statements in the Ray v1.12 or newer output from deploy:run
are log statements before the Api.deploy()
function is called. The deploy.py
file:
logger = logging.get_logger(__name__)
def run(context: Configuration):
logger.debug("Total Resources = %s", ray.cluster_resources())
logger.debug("Available Resources = %s", ray.available_resources())
Api.deploy(context)
logger.debug("Deployments = %s", ray.serve.list_deployments())
if __name__ == "__main__":
run(server.start())
and the server.py
module for the server.start()
process. There is a separate deploy.py
module so I can setup a file watch and automatically refresh/redeploy/reload on code changes during development.
def start():
context = cli.run()
ray.init(address="auto", namespace="serve")
return context
def run():
deploy.run(start())
while True:
time.sleep(5)
if __name__ == "__main__":
run()
FYI, I have PYTHONUNBUFFERED=1
in the Dockerfile based on the Ray docker image, regardless of the Ray version.