Stack overflow with a heavily recursive model

- High: It blocks me from completing my task.

Hi, I am working with an actuarial model that is very heavily recursive.

It is too large to share here, so please bear with me.

The Python class that I run as a Ray actor consists of around 2,000 functions and 100 tables (about 1 GB of CSV files).
Each function is evaluated over 1,200 time steps (100 years * 12 months), and the functions are heavily interconnected.
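To give an idea of the structure without sharing the real model, here is a minimal sketch of one such function. The names `disc_recursive`, `disc_bottom_up` and `MONTHLY_RATE` are hypothetical and the formula is a placeholder, not the real one. It only shows how a value at month t depends on month t - 1, and how the same values could be filled in from t = 0 upward so that the call depth does not grow with the number of steps.

```python
YEARS, MONTHS_PER_YEAR = 100, 12
N_STEPS = YEARS * MONTHS_PER_YEAR  # 1,200 monthly steps

MONTHLY_RATE = 0.002  # hypothetical flat monthly discount rate


def disc_recursive(t: int) -> float:
    """Discount factor at month t, defined in terms of month t - 1.

    The call depth grows linearly with t, which is what stresses the stack.
    """
    if t == 0:
        return 1.0
    return disc_recursive(t - 1) / (1.0 + MONTHLY_RATE)


def disc_bottom_up(n_steps: int) -> list[float]:
    """Same values, computed from t = 0 upward with a constant call depth."""
    values = [1.0]
    for _ in range(n_steps):
        values.append(values[-1] / (1.0 + MONTHLY_RATE))
    return values
```

In the real model the dependencies cross many functions, so rewriting everything bottom-up like this is not a small change.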

Before Python 3.11 the model could not run because of a stack overflow, but from 3.11 onward it works.

I am now using Python 3.11 and the latest version of Ray on Windows.

When I ran the model, the following messages were displayed.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\LG\ray_test\Lib\site-packages\ray\actor.py", line 191, in __call__
    raise TypeError(
TypeError: Actor methods cannot be called directly. Instead of running 'object.BEL()', try 'object.BEL.remote()'.
>>> ray.get(cs.BEL.remote(50))
(Projection pid=4136) Windows fatal exception: stack overflow
(Projection pid=4136)
(Projection pid=4136) Stack (most recent call first):
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\dtypes\missing.py", line 300 in _isna_array
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\dtypes\missing.py", line 213 in _isna
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\dtypes\missing.py", line 178 in isna
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\indexes\base.py", line 2810 in _isnan
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\indexes\base.py", line 2840 in hasnans
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\dtypes\dtypes.py", line 575 in validate_categories
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\dtypes\dtypes.py", line 378 in _finalize
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\dtypes\dtypes.py", line 221 in __init__
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\arrays\categorical.py", line 473 in __init__
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\arrays\categorical.py", line 3042 in factorize_from_iterable
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\arrays\categorical.py", line 3069 in <genexpr>
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\arrays\categorical.py", line 3069 in factorize_from_iterables
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\pandas\core\indexes\multi.py", line 533 in from_arrays
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6621 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 6643 in f_DISC_A_PC
(Projection pid=4136)   File "C:\Users\LG\ray_test\Lib\site-packages\ray\util\tracing\tracing_helper.py", line 467 in _resume_span
(Projection pid=4136)   File "C:\Users\LG\separated_240423.py", line 22628 in DISC_A_PC
(Projection pid=4136)   ...
(raylet) WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
(raylet) E0000 00:00:1713914174.701604    2908 wire_format_lite.cc:626] String field 'ray.rpc.WorkerTableData.exit_detail' contains invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
(raylet) [2024-04-24 08:16:14,702 E 24840 2908] (raylet.exe) logging.cc:97: Unhandled exception: class nlohmann::detail::type_error. what(): [json.exception.type_error.316] invalid UTF-8 byte at index 521: 0xF6
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:104: Stack trace:
(raylet)  unknown
(raylet) terminate
(raylet) _chkstk
(raylet) RtlFindCharInUnicodeString
(raylet) RtlRaiseException
(raylet) RaiseException
(raylet) CxxThrowException
(raylet) BaseThreadInitThunk
(raylet) RtlUserThreadStart
(raylet) *** SIGABRT received at time=1713914174 ***
(raylet)     @   00007FFED184F6A1  (unknown)  abort
(raylet)     @   00007FF7176458A6  (unknown)  (unknown)
(raylet)     @   00007FFED184EDD2  (unknown)  terminate
(raylet)     @   00007FFEA3A21AB1  (unknown)  _NLG_Return2
(raylet)     @   00007FFED44D440F  (unknown)  _chkstk
(raylet)     @   00007FFED444E466  (unknown)  RtlFindCharInUnicodeString
(raylet)     @   00007FFED4484465  (unknown)  RtlRaiseException
(raylet)     @   00007FFED1B953AC  (unknown)  RaiseException
(raylet)     @   00007FFEA0CC6BA7  (unknown)  CxxThrowException
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361: *** SIGABRT received at time=1713914174 ***
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FFED184F6A1  (unknown)  abort
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FF7176458A6  (unknown)  (unknown)
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FFED184EDD2  (unknown)  terminate
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FFEA3A21AB1  (unknown)  _NLG_Return2
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FFED44D440F  (unknown)  _chkstk
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FFED444E466  (unknown)  RtlFindCharInUnicodeString
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FFED4484465  (unknown)  RtlRaiseException
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FFED1B953AC  (unknown)  RaiseException
(raylet) [2024-04-24 08:16:14,711 E 24840 2908] (raylet.exe) logging.cc:361:     @   00007FFEA0CC6BA7  (unknown)  CxxThrowException
(pid=gcs_server) E0000 00:00:1713914174.702005   19680 wire_format_lite.cc:626] String field 'ray.rpc.WorkerTableData.exit_detail' contains invalid UTF-8 data when parsing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\LG\ray_test\Lib\site-packages\ray\_private\auto_init_hook.py", line 21, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\LG\ray_test\Lib\site-packages\ray\_private\client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\LG\ray_test\Lib\site-packages\ray\_private\worker.py", line 2667, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\LG\ray_test\Lib\site-packages\ray\_private\worker.py", line 866, in get_objects
    raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
        class_name:
        actor_id: afb0aa3925ea999ef0abb64b01000000

>>> (raylet) The node with node id: 911343d78ee6f3936e2d0d18cdc7790ff5a0d908263d3e81f41da1b0 and address: 127.0.0.1 and node name: 127.0.0.1 has been marked dead because the detector has missed too many heartbeats from it. This can happen when a
        (1) raylet crashes unexpectedly (OOM, preempted node, etc.)
        (2) raylet has lagging heartbeats due to slow network or busy workload.

What limits does Ray place on this kind of computation, and are there any recommendations for addressing this issue?
Would switching from Windows to Linux solve it?
(I have heard that Ray has limitations on Windows.)

FYI, sys.setrecursionlimit(10**9) is set inside the actor class.
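As far as I understand, sys.setrecursionlimit only raises the interpreter's own depth counter and does not enlarge the operating-system stack of the thread, which is what the "Windows fatal exception: stack overflow" line refers to. One workaround that is sometimes suggested for deep recursion in plain Python is to run the recursive call in a separate thread created with a larger stack via threading.stack_size. Below is a minimal sketch of that idea. The helper name `run_with_big_stack` and its default sizes are assumptions for illustration, and I have not verified that it behaves the same inside a Ray actor.

```python
import sys
import threading


def run_with_big_stack(fn, *args, stack_mb=64, recursion_limit=1_000_000):
    """Run fn(*args) in a new thread that has a larger OS stack.

    Hypothetical helper: stack_mb and recursion_limit are illustrative
    defaults, not values tuned for any particular model.
    """
    result = {}

    def target():
        try:
            result["value"] = fn(*args)
        except BaseException as exc:  # hand the error back to the caller
            result["error"] = exc

    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(recursion_limit)
    # stack_size() returns the previous setting (0 means platform default)
    old_size = threading.stack_size(stack_mb * 1024 * 1024)
    try:
        thread = threading.Thread(target=target)
        thread.start()
        thread.join()
    finally:
        threading.stack_size(old_size)
        sys.setrecursionlimit(old_limit)

    if "error" in result:
        raise result["error"]
    return result["value"]
```

Inside the actor, the deep call would then be wrapped as something like run_with_big_stack(self._bel_impl, 50), where `_bel_impl` is a hypothetical plain method holding the recursive logic.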