Reinforcement Loader

class olympus.reinforcement.dataloader.RLDataLoader(dataset_environment, actor, critic, replay=None)[source]

Bases: object

Parameters:
dataset_environment:

Generic Reinforcement Learning environment

replay:

Replay Vector iterator constructor

transform:

Transform to apply to each simulation state

Methods

close  
shutdown  
test  
train  
valid  
close()[source]
shutdown()[source]
test()[source]
train(no_grad=False)[source]
valid()[source]
class olympus.reinforcement.dataloader.RLTorchIterator(environment, actor, critic, device=None, max_step=None, no_grad=False)[source]

Bases: object

Iterates through environment states

Parameters:
actor: Union[nn.Module, Callable]

Returns the action that should be taken

critic: Union[nn.Module, Callable]

Returns the value of the current state

max_step: Optional[int]

If unspecified, the iterator is infinite; otherwise it stops after max_step steps

no_grad: bool

Whether gradient computation should be disabled for the actor and the critic
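
The max_step behaviour described above (infinite when unspecified, bounded otherwise) can be illustrated with a minimal sketch. This is not the RLTorchIterator implementation; limited_steps and step_fn are hypothetical names introduced only for illustration.

```python
import itertools
from typing import Callable, Iterator, Optional


def limited_steps(step_fn: Callable[[], dict],
                  max_step: Optional[int] = None) -> Iterator[dict]:
    """Yield step_fn() forever, or at most max_step times when max_step is set.

    Hypothetical helper: it only mirrors the documented max_step semantics.
    """
    # itertools.count() never ends; range(max_step) ends after max_step items.
    counter = itertools.count() if max_step is None else range(max_step)
    for _ in counter:
        yield step_fn()


def fake_step() -> dict:
    """Stand-in for one environment transition (assumption, not real data)."""
    return {"reward": 1.0, "done": False}


# Bounded: stops on its own after three steps.
bounded = list(limited_steps(fake_step, max_step=3))

# Unbounded: the caller has to cut the stream, here with islice.
first_five = list(itertools.islice(limited_steps(fake_step), 5))
```

With an unbounded iterator the training loop itself has to decide when to stop, for example by slicing the stream as above or by breaking out of the loop.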

Returns:
A dictionary representing the transition from one state to another
state: Tensor[NCHW, dtype=uint8]

State of the game before the action is taken; for image observations the size is (num_parallel, 3, H, W)

new_state: Tensor[NCHW, dtype=uint8]

State of the game after the action is taken; for image observations the size is (num_parallel, 3, H, W)

action: Tensor[num_parallel, dtype=int]

Action taken for each parallel simulation

log_prob: Tensor[num_parallel, dtype=float]
entropy: Tensor[num_parallel, dtype=float]
critic: Tensor[num_parallel, dtype=float]
reward: Tensor[num_parallel, dtype=float]
done: Tensor[num_parallel, dtype=bool]
info: List[dict] size: num_parallel
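
To make the layout of the returned dictionary concrete, the sketch below checks that a transition carries every key listed above and that each entry has num_parallel items along its first dimension. check_transition is a hypothetical helper, not part of olympus, and plain Python lists stand in for tensors so the example runs without any dependency.

```python
from typing import Any, Dict, List

# Keys listed in the Returns section above.
TRANSITION_KEYS = (
    "state", "new_state", "action", "log_prob",
    "entropy", "critic", "reward", "done", "info",
)


def check_transition(transition: Dict[str, Any], num_parallel: int) -> List[str]:
    """Return a list of problems found in a transition (hypothetical helper)."""
    problems = []
    for key in TRANSITION_KEYS:
        if key not in transition:
            problems.append(f"missing key: {key}")
        elif len(transition[key]) != num_parallel:
            # len() reports the size of the first dimension for lists
            # (and would do the same for tensors).
            problems.append(
                f"{key}: expected {num_parallel} entries, got {len(transition[key])}"
            )
    return problems


# Stand-in transition for two parallel simulations (made-up values).
example = {
    "state": [[0], [0]],
    "new_state": [[1], [1]],
    "action": [2, 0],
    "log_prob": [-0.7, -1.1],
    "entropy": [1.0, 0.9],
    "critic": [0.3, 0.1],
    "reward": [0.0, 1.0],
    "done": [False, True],
    "info": [{}, {}],
}

problems = check_transition(example, num_parallel=2)
```

An empty list means the dictionary matches the documented layout; a check of this kind is mostly useful while wiring up a new environment.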

Methods

close  
to  
close()[source]
to(device)[source]
class olympus.reinforcement.dataloader.ReplayVectorIterator(iterator: olympus.reinforcement.dataloader.RLTorchIterator, num_steps)[source]

Bases: object

Aggregates transitions into a vector for later use

Attributes:
completed_simulations

Number of completed simulations since start

state

Return the latest state

Methods

close  
to  
close()[source]
completed_simulations

Number of completed simulations since start

state

Return the latest state

to(device)[source]
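
The aggregation performed by ReplayVectorIterator can be pictured as grouping num_steps consecutive transitions into one record. The sketch below is an illustration under that assumption only; collect_vectors is a hypothetical function, and the real class may stack tensors or handle leftovers differently.

```python
from typing import Any, Dict, Iterable, Iterator, List


def collect_vectors(transitions: Iterable[Dict[str, Any]],
                    num_steps: int) -> Iterator[Dict[str, List[Any]]]:
    """Group consecutive transitions into dicts of per-key lists.

    Hypothetical sketch: each yielded dict holds num_steps values per key.
    """
    buffer: Dict[str, List[Any]] = {}
    count = 0
    for transition in transitions:
        for key, value in transition.items():
            buffer.setdefault(key, []).append(value)
        count += 1
        if count == num_steps:
            yield buffer
            # Start a fresh buffer so yielded records are never mutated later.
            buffer, count = {}, 0
    # An incomplete trailing group is dropped in this sketch.


# Five made-up transitions grouped two at a time.
steps = [{"reward": float(i), "done": i == 4} for i in range(5)]
vectors = list(collect_vectors(steps, num_steps=2))
```

Here the fifth transition is discarded because it does not fill a complete group; whether the library keeps or drops such a remainder is not stated in this page.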
olympus.reinforcement.dataloader.simple_replay_vector(num_steps)[source]
olympus.reinforcement.dataloader.to_nchw(states)[source]
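
This page does not describe what to_nchw does. Since the states above are documented as NCHW tensors, one plausible reading is that it reorders channel-last image batches into channel-first layout. The sketch below shows that reordering on nested lists; nhwc_to_nchw is a hypothetical name, and the NHWC input layout is an assumption rather than documented behaviour.

```python
from typing import List

Batch = List[List[List[List[int]]]]


def nhwc_to_nchw(batch: Batch) -> Batch:
    """Reorder nested lists from [N][H][W][C] to [N][C][H][W].

    Hypothetical sketch; assumes a non-empty, rectangular, channel-last batch.
    """
    result = []
    for image in batch:
        num_channels = len(image[0][0])
        # For each channel, build an H x W plane from that channel's values.
        planes = [
            [[pixel[c] for pixel in row] for row in image]
            for c in range(num_channels)
        ]
        result.append(planes)
    return result


# One image, 1 row, 2 pixels, 3 channels each.
nhwc = [[[[1, 2, 3], [4, 5, 6]]]]
nchw = nhwc_to_nchw(nhwc)
```

With PyTorch tensors the same reordering is usually written as tensor.permute(0, 3, 1, 2).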