RayDP deployment and third-party dependency issue

I have two issues:

  1. I am using Ray (1.9.2) on a Kubernetes cluster with RayDP. When deploying a new job and creating a Spark session with raydp.init_spark, I get the following exception:

     File "app/spark_on_ray.py", line 8, in <module>
       spark = raydp.init_spark(app_name='RayDP Example', num_executors=2, executor_cores=1, executor_memory='4GB')
     File "/home/ray/anaconda3/lib/python3.7/site-packages/raydp/context.py", line 126, in init_spark
       return _global_spark_context.get_or_create_session()
     File "/home/ray/anaconda3/lib/python3.7/site-packages/raydp/context.py", line 70, in get_or_create_session
       spark_cluster = self._get_or_create_spark_cluster()
     File "/home/ray/anaconda3/lib/python3.7/site-packages/raydp/context.py", line 63, in _get_or_create_spark_cluster
       self._spark_cluster = SparkCluster(self._configs)
     File "/home/ray/anaconda3/lib/python3.7/site-packages/raydp/spark/ray_cluster.py", line 34, in __init__
       self._set_up_master(None, None)
     File "/home/ray/anaconda3/lib/python3.7/site-packages/raydp/spark/ray_cluster.py", line 40, in _set_up_master
       self._app_master_bridge.start_up()
     File "/home/ray/anaconda3/lib/python3.7/site-packages/raydp/spark/ray_cluster_master.py", line 56, in start_up
       self._set_properties()
     File "/home/ray/anaconda3/lib/python3.7/site-packages/raydp/spark/ray_cluster_master.py", line 145, in _set_properties
       options["ray.node-ip"] = node.node_ip_address
     AttributeError: 'NoneType' object has no attribute 'node_ip_address'

  2. How do I include an additional jar, such as spark-cassandra-connector.jar, with RayDP?

Hi @Ritapa_Kundu, thanks for trying RayDP! Sorry for the late reply.
It's probably better to discuss this in the issues of the raydp repo, since it's mainly a RayDP issue.

  1. Where are you running the script? If you are running it on a machine that is not part of the Ray cluster, you should use the Ray client to connect to the cluster and put raydp.init_spark inside a remote actor or task. The AttributeError above comes from RayDP trying to look up the local Ray node's IP address, which is only available when the code runs on a node inside the cluster.
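A minimal sketch of that pattern, assuming your head node exposes the Ray client server on the default port 10001 (the `<head-node-host>` placeholder and the `run_spark_job` task name are illustrative; this only runs against a live Ray cluster with raydp installed):

```python
import ray
import raydp

# Assumption: replace <head-node-host> with your cluster's Ray client
# endpoint, e.g. the Kubernetes service for the head node.
ray.init(address="ray://<head-node-host>:10001")

@ray.remote
def run_spark_job():
    # init_spark now executes inside the Ray cluster, so RayDP can
    # resolve the local node's IP address.
    spark = raydp.init_spark(
        app_name="RayDP Example",
        num_executors=2,
        executor_cores=1,
        executor_memory="4GB",
    )
    try:
        return spark.range(100).count()
    finally:
        raydp.stop_spark()

print(ray.get(run_spark_job.remote()))
```

The key point is that raydp.init_spark never runs on the client machine itself; everything touching the Spark session happens inside the remote task.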
  2. With the RayDP nightly build, you can include extra jars by setting raydp.executor.extraClassPath in the configs argument of raydp.init_spark.
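For example, a configuration sketch along these lines (the jar path is an assumption; the jar must already be present at that path on every node, e.g. baked into the container image or mounted from a shared volume):

```python
import raydp

spark = raydp.init_spark(
    app_name="RayDP Example",
    num_executors=2,
    executor_cores=1,
    executor_memory="4GB",
    configs={
        # Assumption: adjust this path to wherever the connector jar
        # lives on your nodes.
        "raydp.executor.extraClassPath": "/path/to/spark-cassandra-connector.jar",
    },
)
```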