WSGI web serving error

Hi

I am trying to set up a Flask app, served via WSGI on a web server, to host my agent so I can call it as an API. I am getting the following error:

/usr/local/bin/uwsgi: invalid option -- 'u'
getopt_long() error

My skillset doesn’t really cover this area, but luckily the great staff from pythonanywhere are helping me debug this.

What they think might be happening is that Ray is trying to work out which Python interpreter it is running under, presumably by inspecting the command line of the process that started it. Normally that would be something like “/usr/bin/python3.8”, which Ray would then reuse, with extra parameters appended, to start new Python worker processes.

However, uwsgi is a web server (with Python plugins for running Python code) and takes different command-line options to a regular Python interpreter, so it does not recognise the ones being passed in and exits immediately on startup, meaning the Ray cluster never starts up properly.

Is there some way to tell Ray (perhaps in its init method) which Python interpreter path it should use, rather than having it guess based on the command line of the currently executing process?
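For reference, the kind of override I was imagining looks something like this; whether Ray actually derives the worker-launch command from sys.executable is an assumption on my part, and the interpreter path is just an example:

```python
import sys

# Hypothetical workaround: point sys.executable at a real Python
# interpreter before Ray starts, on the assumption that Ray builds
# its worker-launch command from it. Under uwsgi, sys.executable can
# end up pointing at the uwsgi binary rather than a Python interpreter.
REAL_INTERPRETER = "/usr/bin/python3.8"  # path is an assumption
sys.executable = REAL_INTERPRETER

# import ray
# ray.init(ignore_reinit_error=True)  # workers would then be launched
# with REAL_INTERPRETER instead of the uwsgi binary
```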

Hey,

Thanks for your bug report! Do you have a reproducible script, including the way you invoked the web server, so we can look into this more?

Best,
Philipp.

Hi Philipp,

The app is structured as follows:

from datetime import datetime

from flask import Flask, request
from flask_restful import Api, Resource
import numpy as np

import ray
from ray.tune.registry import register_env
from ray.rllib.agents.ppo import PPOTrainer

from batt_env import BatteryEnv


print(f"{datetime.now()}: about to init ray", flush=True)
ray.init(ignore_reinit_error=True)

app = Flask(__name__)
API = Api(app)

print(f"{datetime.now()}: about to load stacked", flush=True)

data = np.load('/home/carterb/mysite/stacked.npy')
data = data / 8000


def env_creator(env_config):
    return BatteryEnv(data, power=20, capacity=20, initial_charge=0,
                      bleed=0.1, starting_temperature=23, temp_change=1,
                      cooldown_rate=1, efficiency=1.0, cycle_cost=0)


print(f"{datetime.now()}: about to register env", flush=True)

register_env("battery", env_creator)

config = {
    "env": "battery",
    "num_workers": 1,
    "explore": False,
    "log_level": "DEBUG",
}

print(f"{datetime.now()}: about to restore agent", flush=True)

trained_trainer = PPOTrainer(config, "battery")
trained_trainer.restore("/home/carterb/mysite/checkpoint-12000")

print(f"{datetime.now()}: about to define Predict resource", flush=True)


class Predict(Resource):

    @staticmethod
    def post():
        print(f"{datetime.now()}: about to accept input", flush=True)
        arr = np.array(request.json['obs'])

        print(f"{datetime.now()}: about to compute action", flush=True)
        action = trained_trainer.compute_action(arr)

        print(f"{datetime.now()}: about to return action", flush=True)
        return {'Action': int(action)}, 200


API.add_resource(Predict, '/predict', methods=['POST'])

print(f"{datetime.now()}: resources added", flush=True)

if __name__ == '__main__':
    # Only use Flask's built-in dev server when run directly; under
    # WSGI the server imports `app` itself, and app.run() must not
    # execute at import time.
    app.run(debug=True)
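For completeness, here is a sketch of how a client could call the /predict endpoint above; the deployment URL and the observation values are assumptions:

```python
import json


def build_predict_payload(obs):
    # Predict.post() reads request.json["obs"] and wraps it in a
    # numpy array, so the client only needs to send the raw list.
    return {"obs": list(obs)}


payload = build_predict_payload([0.1, 0.2, 0.3])
print(json.dumps(payload))

# Against the deployed app (URL is an assumption):
# import requests
# resp = requests.post(
#     "https://carterb.eu.pythonanywhere.com/predict", json=payload)
# print(resp.json())  # {"Action": <int>}
```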

It is invoked through:

# This file contains the WSGI configuration required to serve up your
# web application at http://<your-username>.pythonanywhere.com/
# It works by setting the variable 'application' to a WSGI handler of some
# description.
#
# The below has been auto-generated for your Flask project

import sys

# add your project directory to the sys.path
project_home = '/home/carterb/mysite'
if project_home not in sys.path:
    sys.path = [project_home] + sys.path

# import flask app but need to call it "application" for WSGI to work
from flask_app import app as application  # noqa

If you need the actual agent with example data, I would happily send it over if you have an email?

Thanks! And how do you start/invoke the application on the command line, and where do you put these files?

[I don’t think we need the data, can probably just remove the code that involves the data]

I am unsure; on my end it just shows ‘reload [name of my website]’. But I will ask and find out for you, thanks!

In terms of the file structure:

WSGI configuration file:
/var/www/carterb_eu_pythonanywhere_com_wsgi.py

The actual web app (and accompanying files that are used) sits in a directory:
/home/carterb/mysite

Hi @Carterbouley, is uwsgi necessary? Can you try gunicorn? We are aware of a possible compatibility issue with uwsgi and mod_wsgi: OSError: Apache/mod_wsgi log object is not associated with a file descriptor. · Issue #13716 · ray-project/ray · GitHub

Alternatively, if you just want some web server that talks to a Ray cluster, check out http://rayserve.org/.
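For what it’s worth, a gunicorn invocation for the app above might look like this; the module name, working directory, worker count, and port are all assumptions:

```shell
# Hypothetical: serve the `app` object in flask_app.py with gunicorn.
# A single worker avoids loading the PPO checkpoint more than once.
gunicorn --chdir /home/carterb/mysite --workers 1 \
         --bind 0.0.0.0:8000 flask_app:app
```

Note that on PythonAnywhere the WSGI server is managed for you, so a command like this would mainly apply to a self-hosted setup.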

Hi Simon

I’ve been trying to set it up using Ray Serve but I’m still having trouble. Do you know of any full end-to-end projects I could use as a template?

@simon-mo The issue still persists; any idea how to resolve it?
@Carterbouley Did you manage to have this resolved?

One more thing: it seems the issue doesn’t show up when local_mode=True is passed during initialization (which runs everything in a single process and is intended for testing only). We can’t use local_mode in production, as that defeats the purpose of Ray’s parallelism.