I tried out another way to achieve persistent variables in a worker process. Apart from global variables, module-level variables are also preserved across function calls in Python: a module is executed only once per process, and later imports return the cached module object (unless it is explicitly reloaded, e.g. with importlib.reload). So I created a simple module called “process_persistent” and placed it in the Lib folder of Python. It simply contains a dict “data” to store arbitrary data and a cleanUp function to release the stored objects if needed:
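    data = {}

    def cleanUp():
        # Drop every stored reference so the objects can be finalized.
        # (A bare "del d" inside a loop would only unbind the loop variable.)
        data.clear()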
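To illustrate why this works, here is a minimal sketch of the same idea that runs without Excel or Ray. It assumes a hypothetical in-memory module named process_persistent_demo (registered in sys.modules instead of being placed in the Lib folder) and uses a plain object() as a stand-in for an expensive handle such as the Excel COM object:

```python
import sys
import types

# Hypothetical stand-in for process_persistent, built in memory so the
# sketch runs without installing a file into Python's Lib folder.
_demo = types.ModuleType("process_persistent_demo")
_demo.data = {}
sys.modules["process_persistent_demo"] = _demo


def get_resource():
    # The import is served from sys.modules, so the same module object
    # (and therefore the same dict) is returned on every call.
    import process_persistent_demo as persistent

    created = "resource" not in persistent.data
    if created:
        # object() stands in for an expensive handle (assumption).
        persistent.data["resource"] = object()
    return persistent.data["resource"], created


first, created_first = get_resource()
second, created_second = get_resource()
print(created_first, created_second, first is second)  # True False True
```

The second call reuses the object created by the first one, which is the behaviour I rely on inside a single worker process. Each worker process has its own copy of the module, so nothing is shared between workers.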
Now if I use the data variable of this module instead of a global variable, everything works fine with:
    import ray
    import win32com.client as win32

    def f(x):
        import process_persistent as persistent
        if 'excel' not in persistent.data:
            persistent.data['excel'] = win32.Dispatch('Excel.Application')
            print('Launched Excel')
        else:
            print('Reusing Excel')
        return persistent.data['excel'].Evaluate(str(x) + '*2')

    class MyProblem:
        def __init__(self):
            self.func = f

    def evalFunc(problem, x):
        return problem.func(x)

    ray.init(num_cpus=1)
    evalFuncRemote = ray.remote(evalFunc)
    problem = MyProblem()
    problemRemote = ray.put(problem)
    futures = [evalFuncRemote.remote(problemRemote, x) for x in range(5)]
    print(ray.get(futures))
This gives the output:
[0.0, 2.0, 4.0, 6.0, 8.0]
(evalFunc pid=40496) Launched Excel
(evalFunc pid=40496) Reusing Excel
(evalFunc pid=40496) Reusing Excel
(evalFunc pid=40496) Reusing Excel
(evalFunc pid=40496) Reusing Excel
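So the issue seems to be resolved for me: Excel is launched once and then reused by the same worker process (same pid) for all subsequent calls. Any other ideas and comments are highly appreciated before I mark this as resolved.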
Thanks again!