1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
[-] Medium: Significantly affects my productivity but can find a workaround.
High: Completely blocks me.
2. Environment:
- Ray version: 2.42.0
- Python version: 3.12.10
- OS: macOS
- Cloud/Infrastructure:
- Other libs/tools (if relevant):
3. What happened vs. what you expected:
- Expected: When I call the Ray remote task, it should start processing.
- Actual: Instead, I get an error saying "too many positional arguments".
I am fairly new to Ray and have just started understanding how it works. I installed Ray with pip install ray[default]==2.42.0 and wanted to try out my tasks locally before moving on to deployment.
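For context, the basic local pattern I am following (outside of FastAPI) is just the standard one from the Ray docs. A minimal sketch of what I mean by "trying tasks locally" (the task name here is illustrative, not my real code):

import ray

ray.init()  # start a local Ray instance

@ray.remote
def square(x: int) -> int:
    # trivial task used only as a sanity check
    return x * x

ref = square.remote(4)  # schedule the task and get an ObjectRef back
print(ray.get(ref))     # blocks until the result (16) is ready

ray.shutdown()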
Ray initialization (in the FastAPI lifespan hook):
try:
    if os.getenv("ENVIRONMENT") == "production":
        ray_address = os.getenv("RAY_ADDRESS", "ray://ray-head:10001")
        ray.init(address=ray_address)
        logger.info(f"Connected to Ray cluster at {ray_address}")
    else:
        import tempfile
        import uuid

        unique_id = str(uuid.uuid4())[:8]
        temp_dir = f"/tmp/ray_{unique_id}"
        try:
            ray.init(address="auto", ignore_reinit_error=True)
            logger.info("Connected to existing local Ray cluster")
        except Exception:
            ray.init(
                ignore_reinit_error=True,
                include_dashboard=True,
                dashboard_port=0,  # Let Ray find an available port
                _temp_dir=temp_dir,
            )
            logger.info(f"Started new local Ray cluster with temp_dir: {temp_dir}")
    logger.info(f"Ray cluster resources: {ray.cluster_resources()}")
except Exception as e:
    logger.exception(f"Failed to initialize Ray: {e}")
    raise
# I have tried with include_dashboard=False as well.
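For completeness, the block above sits inside a FastAPI lifespan context manager, roughly like the sketch below (simplified; init_ray is a hypothetical wrapper around the ray.init() logic shown above):

from contextlib import asynccontextmanager

import ray
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # runs once at startup, before the app starts serving requests
    init_ray()  # hypothetical wrapper around the try/except block above
    yield
    # runs once at shutdown
    ray.shutdown()

app = FastAPI(lifespan=lifespan)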
My FastAPI endpoint looks like the following:
@router.post("/file-upload")
async def file_upload(
    project_id: str,
    connection_id: str,
    user_id: str,
    file: UploadFile = File(...),
    database: str = "milvus",
    container_name: str = "askbodhi-test",
    parser_backend: str = FileParser.pypdf,
    chunkingStrategy: str = ChunkingStrategy.Recursive,
    spacy_model: str = "en_core_web_sm",
    image_weight: float = 0.5,
    tokenisation: str = Tokenisation.Word,
    chunk_size: int = 1000,
    chunk_overlap: int = 200,
    programming_language: str = "python",
):
    try:
        initialize_task_status(project_id, connection_id, user_id)
        task_status_storage[f"{project_id}_{connection_id}_{user_id}"].job_status = JobStatus.PROCESSING
        task_status_storage[f"{project_id}_{connection_id}_{user_id}"].current_step = "check_collection"
        task_status_storage[f"{project_id}_{connection_id}_{user_id}"].process_percentage = 0
        task_status_storage[f"{project_id}_{connection_id}_{user_id}"].started_at = datetime.now()
        task_status_storage[f"{project_id}_{connection_id}_{user_id}"].updated_at = datetime.now()
        process_unstructured.remote(
            task_status_storage=task_status_storage,
            project_id=project_id,
            connection_id=connection_id,
            user_id=user_id,
            file=file,
            # ... other arguments
        )
        return task_status_storage[f"{project_id}_{connection_id}_{user_id}"]
    except Exception as e:
        raise Exception(e)
The Ray remote function looks like the following:
from fastapi import UploadFile

@ray.remote
def process_unstructured(
    task_status_storage: Dict[str, TaskStatusModel],
    project_id: str,
    connection_id: str,
    user_id: str,
    file: UploadFile,
    ...
):
    ...  # logic goes here

# The file here is the file received as input from the user in the endpoint above.
# I have also tried writing the uploaded file to a temp folder and passing the file path instead, but I still get the same error.
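The file-path variant mentioned in the comment above looked roughly like this inside the endpoint (reconstructed sketch; tmp_path is just an illustrative name):

import shutil
import tempfile

# write the UploadFile contents to disk first...
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    shutil.copyfileobj(file.file, tmp)
    tmp_path = tmp.name

# ...then pass only the plain string path to the remote task
process_unstructured.remote(
    task_status_storage=task_status_storage,
    project_id=project_id,
    connection_id=connection_id,
    user_id=user_id,
    file=tmp_path,  # str path instead of the UploadFile object
    # ... other arguments, as in the endpoint above
)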
The error I get:
2025-06-15 15:33:28,124 ERROR worker.py:422 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::process_unstructured() (pid=64422, ip=127.0.0.1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/navpenum/Documents/InsightsIQ/data-converse-product/bodhi-data-converse-api-src/venv/lib/python3.12/site-packages/ray/remote_function.py", line 156, in _remote_proxy
return self._remote(
^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/navpenum/Documents/InsightsIQ/data-converse-product/bodhi-data-converse-api-src/venv/lib/python3.12/site-packages/ray/remote_function.py", line 504, in _remote
return invocation(args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/navpenum/Documents/InsightsIQ/data-converse-product/bodhi-data-converse-api-src/venv/lib/python3.12/site-packages/ray/remote_function.py", line 462, in invocation
list_args = ray._private.signature.flatten_args(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: too many positional arguments
During handling of the above exception, another exception occurred:
ray::process_unstructured() (pid=64422, ip=127.0.0.1)
File "/Users/navpenum/Documents/InsightsIQ/data-converse-product/bodhi-data-converse-api-src/dataconverse/ray/unstructured.py", line 135, in process_unstructured
raise Exception(e)
Exception: too many positional arguments
(pid=64422) /Users/navpenum/Documents/InsightsIQ/data-converse-product/bodhi-data-converse-api-src/venv/lib/python3.12/site-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_provider" in VectorizerConfiguration has conflict with protected namespace "model_".
(pid=64422)
(pid=64422) You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
(pid=64422) warnings.warn(
I am running the FastAPI application with uvicorn, and Ray is initialized in a lifespan hook as shown above (I am not using Ray Serve or a separate ray start --head command).
Also, what does "too many positional arguments" actually mean here?
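From a bit of digging, the wording seems to come from standard-library signature binding rather than from Ray itself: plain inspect raises the identical message when a call supplies more positional arguments than the function's signature accepts. A small sketch of what I mean (this is just my guess at where the message originates):

import inspect

def f(a, b):
    return a + b

sig = inspect.signature(f)
sig.bind(1, 2)     # binds fine
sig.bind(1, 2, 3)  # TypeError: too many positional arguments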
PS:
- I double-checked and can confirm that the number of arguments I pass matches the remote function's parameters exactly.
- Please ignore the other warnings and errors unrelated to Ray.
- Please also let me know if I need to provide any additional information.