Ray on Spark example failing

I’m attempting to run the example code for Ray on Spark, found here: Basic Example.

I’m using Ray 2.5.1 and PySpark 3.3.2.

When I run the example script, I’m seeing the error below. Is this an issue with my environment? I’m trying to understand how we can use Ray and Spark together, so any help with this would be much appreciated.
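
For reference, here is roughly what my test.py looks like. This is a lightly abridged sketch of the basic example from the docs (the SparkSession settings are approximate); the `setup_ray_cluster` call is the one at line 16 in the traceback below:

```python
import ray
from pyspark.sql import SparkSession
from ray.util.spark import setup_ray_cluster, shutdown_ray_cluster, MAX_NUM_WORKER_NODES

spark = SparkSession.builder.appName("ray-on-spark-test").getOrCreate()

# Start a Ray cluster on top of the Spark application; this is the call that fails.
setup_ray_cluster(num_worker_nodes=MAX_NUM_WORKER_NODES)

ray.init()
# ... Ray workloads would go here ...

shutdown_ray_cluster()
spark.stop()
```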

23/07/13 16:43:18 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
Traceback (most recent call last):
  File "ra_spark_support/test.py", line 16, in <module>
    setup_ray_cluster(num_worker_nodes=MAX_NUM_WORKER_NODES)
  File "/home/ubuntu/dev/spark-support/venv/lib/python3.8/site-packages/ray/util/spark/cluster_init.py", line 987, in setup_ray_cluster
    ) = get_avail_mem_per_ray_worker_node(
  File "/home/ubuntu/dev/spark-support/venv/lib/python3.8/site-packages/ray/util/spark/utils.py", line 344, in get_avail_mem_per_ray_worker_node
    spark.sparkContext.parallelize([1], 1).map(mapper).collect()[0]
  File "/home/ubuntu/dev/spark-support/venv/lib/python3.8/site-packages/pyspark/rdd.py", line 1197, in collect
    sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
  File "/home/ubuntu/dev/spark-support/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "/home/ubuntu/dev/spark-support/venv/lib/python3.8/site-packages/pyspark/sql/utils.py", line 190, in deco
    return f(*a, **kw)
  File "/home/ubuntu/dev/spark-support/venv/lib/python3.8/site-packages/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (ip-172-31-31-1.ec2.internal executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/opt/spark/python/lib/pyspark.zip/pyspark/worker.py", line 668, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/opt/spark/python/lib/pyspark.zip/pyspark/worker.py", line 85, in read_command
    command = serializer._read_with_length(file)
  File "/opt/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 173, in _read_with_length
    return self.loads(obj)
  File "/opt/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 471, in loads
    return cloudpickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'ray.util.spark'

Thanks,
Jason

Oh, this is because the ray package isn’t installed in the Spark executors’ Python environment. :slight_smile: The traceback shows the executor running Spark’s own Python under /opt/spark, so when it unpickles the mapper it can’t import ray.util.spark, even though ray is installed in your driver’s virtualenv.
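
One way to fix it is to point the executors at the same Python interpreter that has ray installed. A minimal sketch, assuming a single-node or local setup and that the venv path from your traceback (/home/ubuntu/dev/spark-support/venv) is where ray lives; on a multi-node cluster you would instead pip install ray into the executor Python environment on every worker node:

```python
import os

# Point executors (and driver) at the Python that has ray installed.
# This must happen *before* the SparkSession is created, so the pickled
# ray.util.spark mapper can be imported on the worker side.
# Path taken from the traceback above -- adjust if your venv lives elsewhere.
venv_python = "/home/ubuntu/dev/spark-support/venv/bin/python"
os.environ["PYSPARK_PYTHON"] = venv_python
os.environ["PYSPARK_DRIVER_PYTHON"] = venv_python

from pyspark.sql import SparkSession
from ray.util.spark import setup_ray_cluster, MAX_NUM_WORKER_NODES

spark = SparkSession.builder.getOrCreate()
setup_ray_cluster(num_worker_nodes=MAX_NUM_WORKER_NODES)
```

The same thing can be done with the `spark.pyspark.python` config option when submitting the job; the key point is that every process that runs Spark tasks must be able to `import ray.util.spark`.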