Reinforcement Loader

class olympus.reinforcement.dataloader.RLDataLoader(dataset_environment, actor, critic, replay=None)

Bases: object

Parameters:

dataset_environment: Generic Reinforcement Learning environment
replay: Replay Vector iterator constructor
transform: Transform to apply to each simulation state

Methods

close
shutdown
test
train
valid

close()

shutdown()

test()

train(no_grad=False)

valid()

class olympus.reinforcement.dataloader.RLTorchIterator(environment, actor, critic, device=None, max_step=None, no_grad=False)

Bases: object

Iterates through environment states

Parameters:

actor: Union[nn.Module, Callable]: Returns the action that should be taken
critic: Union[nn.Module, Callable]: Returns the value of the current state
max_step: Optional[int]: If unspecified, the iterator is infinite; otherwise it stops after max_step steps
no_grad: bool: If True, the actor and the critic are evaluated without computing gradients

Returns:

A dictionary representing the transition from one state to another:
state: Tensor[NCHW, dtype=uint8]: State of the game before the action is taken for images (size: (num_parallel, 3, H, W))
new_state: Tensor[NCHW, dtype=uint8]: State of the game after the action is taken for images (size: (num_parallel, 3, H, W))
action: Tensor[num_parallel, dtype=int]: Action taken for each parallel simulation
log_prob: Tensor[num_parallel, dtype=float]
entropy: Tensor[num_parallel, dtype=float]
critic: Tensor[num_parallel, dtype=float]
reward: Tensor[num_parallel, dtype=float]
done: Tensor[num_parallel, dtype=bool]
info: List[dict] size: num_parallel
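
The transition dictionary and the max_step behaviour described above can be sketched in plain Python. This is only an illustration, not the olympus implementation: `ToyEnvironment` and `iterate_transitions` are hypothetical names, plain lists stand in for tensors, and the `log_prob` and `entropy` entries are left out because they depend on the actor's distribution.

```python
import itertools


class ToyEnvironment:
    """Hypothetical vectorized environment; not part of olympus."""

    def __init__(self, num_parallel, episode_length=3):
        self.num_parallel = num_parallel
        self.episode_length = episode_length
        self.clock = [0] * num_parallel

    def reset(self):
        self.clock = [0] * self.num_parallel
        return list(self.clock)

    def step(self, actions):
        self.clock = [t + 1 for t in self.clock]
        done = [t >= self.episode_length for t in self.clock]
        reward = [float(a) for a in actions]
        # A finished simulation restarts from its initial state
        self.clock = [0 if d else t for t, d in zip(self.clock, done)]
        return list(self.clock), reward, done, [{} for _ in actions]


def iterate_transitions(env, actor, critic, max_step=None):
    """Yield one transition dictionary per environment step."""
    state = env.reset()
    # Without max_step the loop never ends; otherwise it runs max_step times
    steps = itertools.count() if max_step is None else range(max_step)
    for _ in steps:
        action = actor(state)
        value = critic(state)
        new_state, reward, done, info = env.step(action)
        yield {
            "state": state,
            "new_state": new_state,
            "action": action,
            "critic": value,
            "reward": reward,
            "done": done,
            "info": info,
        }
        state = new_state


env = ToyEnvironment(num_parallel=2)
actor = lambda state: [1 for _ in state]      # always picks action 1
critic = lambda state: [0.0 for _ in state]   # constant value estimate
transitions = list(iterate_transitions(env, actor, critic, max_step=4))
```

With `max_step=None` the same generator would have to be bounded by the caller, for example with `itertools.islice`.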

Methods

close
to

close()

to(device)

class olympus.reinforcement.dataloader.ReplayVectorIterator(iterator: olympus.reinforcement.dataloader.RLTorchIterator, num_steps)

Bases: object

Aggregates transitions into a vector for later use

Attributes:

completed_simulations: Number of completed simulations since start
state: Return the latest state

Methods

close
to

close()

completed_simulations: Number of completed simulations since start

state: Return the latest state

to(device)
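
One way to picture the aggregation is the following sketch, written under the assumption that each vector groups num_steps consecutive transitions key by key. `ToyReplayVector` is a hypothetical name; the actual class may stack tensors or handle partial vectors differently.

```python
class ToyReplayVector:
    """Hypothetical sketch; not the olympus implementation."""

    def __init__(self, iterator, num_steps):
        self.iterator = iter(iterator)
        self.num_steps = num_steps
        self.completed_simulations = 0  # finished simulations seen so far
        self.state = None               # latest state seen so far

    def __iter__(self):
        return self

    def __next__(self):
        vector = {}
        for _ in range(self.num_steps):
            # When the source runs out, StopIteration ends this iterator too
            # and any partially filled vector is dropped (an assumption)
            transition = next(self.iterator)
            for key, value in transition.items():
                vector.setdefault(key, []).append(value)
            self.completed_simulations += sum(transition["done"])
            self.state = transition["new_state"]
        return vector


# Six hand-written transitions for two parallel simulations
transitions = [
    {"new_state": [t + 1, t + 1], "reward": [1.0, 0.0], "done": [t == 2, False]}
    for t in range(6)
]
replay = ToyReplayVector(transitions, num_steps=3)
first = next(replay)
```

Here `first` maps each key to a list of three per-step entries, and the two attributes track what has been consumed so far.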

olympus.reinforcement.dataloader.simple_replay_vector(num_steps)

olympus.reinforcement.dataloader.to_nchw(states)