Hi @carlorop,
The short answer is that agents that map to the same policy use the same models and buffer.
Here is a slightly more involved answer:
Does it make sense?