Reinforcement Loader
class olympus.reinforcement.dataloader.RLDataLoader(dataset_environment, actor, critic, replay=None)
Bases: object
Parameters: - dataset_environment:
Generic Reinforcement Learning environment
- replay:
Replay Vector iterator constructor
- transform:
Transform to apply to each simulation state
Methods
close shutdown test train valid
class olympus.reinforcement.dataloader.RLTorchIterator(environment, actor, critic, device=None, max_step=None, no_grad=False)
Bases: object
Iterates through environment states
Parameters: - actor: Union[nn.Module, Callable]
Returns the action that should be taken
- critic: Union[nn.Module, Callable]
Returns the value of the current state
- max_step: Optional[int]
If unspecified, the iterator is infinite; otherwise it stops after max_step steps
- no_grad: bool
If True, the actor and the critic are evaluated without computing gradients
Returns: - A dictionary representing the transition from one state to another
- state: Tensor[NCHW, dtype=uint8]
State of the game before the action is taken; for image observations the size is (num_parallel, 3, H, W)
- new_state: Tensor[NCHW, dtype=uint8]
State of the game after the action is taken; for image observations the size is (num_parallel, 3, H, W)
- action: Tensor[num_parallel, dtype=int]
Action taken for each parallel simulation
- log_prob: Tensor[num_parallel, dtype=float]
- entropy: Tensor[num_parallel, dtype=float]
- critic: Tensor[num_parallel, dtype=float]
- reward: Tensor[num_parallel, dtype=float]
- done: Tensor[num_parallel, dtype=bool]
- info: List[dict] size: num_parallel
Methods
close to
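The iteration protocol described above can be illustrated with a small pure-Python sketch. It is not the olympus implementation: `ToyEnv` and `iterate_transitions` are hypothetical names, plain lists stand in for tensors, and the actor is assumed to return a per-simulation probability of choosing action 1 so that `log_prob` and `entropy` can be derived. It only shows the shape of the transition dictionary and how `max_step` bounds an otherwise infinite iterator.

```python
import math
import random
from typing import Callable, Dict, Iterator, List, Optional


class ToyEnv:
    """Hypothetical stand-in environment: `num_parallel` independent counters."""

    def __init__(self, num_parallel: int = 2, horizon: int = 3):
        self.horizon = horizon
        self.state = [0] * num_parallel

    def step(self, actions: List[int]):
        new_state = [s + 1 for s in self.state]
        reward = [float(a) for a in actions]
        done = [s >= self.horizon for s in new_state]
        info = [{} for _ in actions]
        # Finished simulations restart from 0 for the next step.
        self.state = [0 if d else s for s, d in zip(new_state, done)]
        return new_state, reward, done, info


def iterate_transitions(
    env: ToyEnv,
    actor: Callable[[List[int]], List[float]],
    critic: Callable[[List[int]], List[float]],
    max_step: Optional[int] = None,
) -> Iterator[Dict[str, list]]:
    """Yield one transition dictionary per environment step."""
    step = 0
    # With max_step=None this loop never ends, mirroring the infinite iterator.
    while max_step is None or step < max_step:
        state = list(env.state)
        probs = actor(state)  # assumption: probability of picking action 1
        action = [1 if random.random() < p else 0 for p in probs]
        log_prob = [math.log(p if a == 1 else 1 - p) for p, a in zip(probs, action)]
        entropy = [-(p * math.log(p) + (1 - p) * math.log(1 - p)) for p in probs]
        value = critic(state)
        new_state, reward, done, info = env.step(action)
        yield {
            "state": state, "new_state": new_state, "action": action,
            "log_prob": log_prob, "entropy": entropy, "critic": value,
            "reward": reward, "done": done, "info": info,
        }
        step += 1


# Hypothetical actor and critic: a fair coin flip and an identity value estimate.
actor = lambda state: [0.5] * len(state)
critic = lambda state: [float(s) for s in state]

for transition in iterate_transitions(ToyEnv(), actor, critic, max_step=4):
    print(transition["state"], transition["new_state"], transition["done"])
```

Each yielded dictionary carries one entry per parallel simulation for every key, which is what lets downstream code stack transitions over time.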
class olympus.reinforcement.dataloader.ReplayVectorIterator(iterator: olympus.reinforcement.dataloader.RLTorchIterator, num_steps)
Bases: object
Aggregates transitions into a vector for later use
Attributes: - completed_simulations:
Number of completed simulations since start
- state:
Return the latest state
Methods
close to
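As a rough illustration of aggregating transitions into a vector, the sketch below groups `num_steps` consecutive transition dictionaries into one dictionary of per-key lists, while tracking the number of completed simulations and the latest state. `ReplaySketch` and `fake_transitions` are hypothetical names introduced here; the actual `ReplayVectorIterator` may batch, stack, or count differently.

```python
from typing import Dict, Iterable, Iterator


class ReplaySketch:
    """Hypothetical aggregator: groups `num_steps` transitions per vector."""

    def __init__(self, iterator: Iterable[Dict[str, list]], num_steps: int):
        self.iterator = iter(iterator)
        self.num_steps = num_steps
        self.completed_simulations = 0  # simulations finished since start
        self.state = None               # latest state seen

    def __iter__(self) -> Iterator[Dict[str, list]]:
        return self

    def __next__(self) -> Dict[str, list]:
        batch = []
        for _ in range(self.num_steps):
            # StopIteration propagates when the source ends; an incomplete
            # trailing batch is dropped in this sketch.
            transition = next(self.iterator)
            self.completed_simulations += sum(transition["done"])
            self.state = transition["new_state"]
            batch.append(transition)
        # One list per key, ordered by time step.
        return {key: [t[key] for t in batch] for key in batch[0]}


def fake_transitions(n: int) -> Iterator[Dict[str, list]]:
    """Hypothetical source: two parallel simulations, the first ends every 3 steps."""
    for i in range(n):
        yield {
            "state": [i, i],
            "new_state": [i + 1, i + 1],
            "reward": [1.0, 0.0],
            "done": [i % 3 == 2, False],
        }


replay = ReplaySketch(fake_transitions(6), num_steps=3)
vectors = list(replay)
print(len(vectors), replay.completed_simulations, replay.state)
```

With real tensors, the per-key lists would typically be stacked along a new time dimension before being passed to a loss function.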