Following up on this, as I was able to fix the bug for my particular use case.
While my custom environment ran just fine in my own test loop (simulating ad-hoc and heuristic control policies), its formatting deviated in a few minor ways from what the OpenAI Gym API expects. I did not catch this until I ran the environment through Stable-Baselines3’s environment checker (check_env), which flagged the issues (e.g. passing an observation as a column vector as opposed to a 1D array - sorry, I am a maths person…a 1D array still doesn’t make great sense to me).
Anywho, after flattening my observations, the “assert priority > 0” error above stopped being thrown in RLlib, so the observation shape seems to have been the cause. However, the error message itself was pretty unhelpful in diagnosing this problem; without SB3’s checker, I likely would not have fixed it.
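In case it helps anyone else, here is a minimal sketch of the kind of fix I mean. It is not my actual environment code: the helper name flatten_observation and the example values are made up for illustration, and it assumes the observation is a NumPy array (or something NumPy can convert) that should end up with shape (n,).

```python
import numpy as np


def flatten_observation(obs):
    """Return obs as a 1D float32 array, e.g. shape (n, 1) becomes (n,).

    Hypothetical helper for illustration only.
    """
    return np.asarray(obs, dtype=np.float32).reshape(-1)


# Hypothetical observation built as a column vector (shape (3, 1)).
column_obs = np.array([[0.1], [0.2], [0.3]])

flat_obs = flatten_observation(column_obs)
print(column_obs.shape, "->", flat_obs.shape)  # (3, 1) -> (3,)
```

The idea would be to apply something like this wherever reset() and step() return an observation, and to declare the observation space with the matching 1D shape. Running check_env (from stable_baselines3.common.env_checker) on the environment afterwards should confirm whether the shapes now agree.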
Not sure how development issues are raised for this project, but a compatibility-checking tool for custom environments seems like it would be a useful quality-of-life improvement, so that the above does not happen to others.