Configuration for infinite horizon (continuous/non-episodic) environments?

What parameters do you pass to `AlgorithmConfig` for infinite horizon MDPs (continuous/non-episodic)? I have found the links below on this topic, but they all involve old RLlib versions, and there have been significant API changes since then.

Currently, when I train in such an environment, the reported episode reward is simply `nan`, since the environment never terminates within the default 100 steps.
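
For reference, here is a minimal sketch of the kind of setup I mean. `ContinuingEnv` is a toy stand-in for my actual environment, and PPO is just an example algorithm; the point is only that the env never sets `terminated` or `truncated`:

```python
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from ray.rllib.algorithms.ppo import PPOConfig


class ContinuingEnv(gym.Env):
    """Toy stand-in for my env: it never terminates or truncates."""

    def __init__(self, config=None):
        super().__init__()
        self.observation_space = spaces.Box(-1.0, 1.0, (4,), np.float32)
        self.action_space = spaces.Discrete(2)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return self.observation_space.sample(), {}

    def step(self, action):
        obs = self.observation_space.sample()
        reward = float(action)
        # Infinite horizon: neither terminated nor truncated is ever True.
        return obs, reward, False, False, {}


config = PPOConfig().environment(ContinuingEnv)
algo = config.build()
print(algo.train())  # episode reward metrics come back as nan for me
```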

Thanks!