Replay Vector
class olympus.reinforcement.replay.ReplayVector
Bases: object
Holds all the state transitions of the simulation for training purposes.
Notes
- Steps: number of simulation steps
- Simulation: number of parallel simulations
Examples
The output below shows the size of each field with num_steps=32, num_simulation=4, and a state size of (3, 210, 160) (images of the simulation):
>>> replay.describe()
rewards      : torch.Size([32, 4])
states       : torch.Size([32, 4, 3, 210, 160])
next_states  : torch.Size([32, 4, 3, 210, 160])
critic_values: torch.Size([32, 4])
actions      : torch.Size([32, 4])
log_probs    : torch.Size([32, 4])
mask         : torch.Size([32, 4])
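The shapes listed by describe() follow one pattern: scalar fields are indexed by (step, simulation), and state fields append the state size to that pair. The sketch below reproduces the pattern in plain Python. The helper name expected_shapes is hypothetical and not part of olympus; it only illustrates the arithmetic.

```python
def expected_shapes(num_steps, num_simulation, state_size):
    """Return the shape each replay field would have (hypothetical helper)."""
    # Scalar fields hold one value per step and per parallel simulation.
    per_step = (num_steps, num_simulation)
    # State fields hold one full state per step and per parallel simulation.
    per_state = per_step + tuple(state_size)
    return {
        "rewards": per_step,
        "states": per_state,
        "next_states": per_state,
        "critic_values": per_step,
        "actions": per_step,
        "log_probs": per_step,
        "mask": per_step,
    }


shapes = expected_shapes(num_steps=32, num_simulation=4, state_size=(3, 210, 160))
for name, shape in shapes.items():
    print(f"{name:<13}: {shape}")
```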
Attributes:
- transitions: List of all the stored transitions
- state_size: Size of the simulation state
- simulation_batch: Number of different simulation states in one Transition struct
- grad_batch: Total number of states in this object, grad_batch = simulation_batch * len(transitions)
* <------------------- steps --------------------------------->
^ [states 0] [states 1] [states 2] [states 3]
| [states 0] [states 1] [states 2]
| [states 0] [states 1] [states 2] [states 3]
v [states 0] [states 1] [states 2] [states 3] [states 4]
* <------------------- steps --------------------------------->
  Batch 0    Batch 1    Batch 2    Batch 3    Batch 4
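As a quick check of the grad_batch formula, the sketch below computes it for the configuration used in the describe() example (32 steps, 4 parallel simulations). The function is a hypothetical stand-in for the attribute, and the transitions are placeholder objects; how the real class handles simulations that stop early is not covered here.

```python
def grad_batch(simulation_batch, transitions):
    """Total number of stored states: simulation_batch * len(transitions)."""
    return simulation_batch * len(transitions)


# Placeholder transitions: 32 steps, each covering 4 parallel simulations.
transitions = [object() for _ in range(32)]
total = grad_batch(simulation_batch=4, transitions=transitions)
print(total)  # 128
```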
Methods
- actions(self)
- next_states(self)
- states(self)
- append
- critic_values
- describe
- entropies
- log_probs
- masks
- rewards
- to_dict
grad_batch
simulation_batch
state_size
transitions
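To make the relationship between append, the per-field accessors, and grad_batch concrete, here is a minimal list-based sketch. MiniReplay is a hypothetical stand-in, not the olympus implementation: the real class returns torch tensors, and its append signature is not shown in this reference, so the one used here is an assumption.

```python
from collections import namedtuple

# Stand-in with the documented field names; not the olympus class itself.
Transition = namedtuple(
    "Transition",
    ["state", "action", "reward", "log_prob", "entropy", "critic", "mask", "next_state"],
)


class MiniReplay:
    """Hypothetical, list-based stand-in for ReplayVector."""

    def __init__(self, simulation_batch):
        self.simulation_batch = simulation_batch
        self.transitions = []

    def append(self, transition):
        # Assumed signature: one Transition per simulation step.
        self.transitions.append(transition)

    def rewards(self):
        # One row per step, one column per parallel simulation.
        return [list(t.reward) for t in self.transitions]

    def masks(self):
        return [list(t.mask) for t in self.transitions]

    @property
    def grad_batch(self):
        return self.simulation_batch * len(self.transitions)


replay = MiniReplay(simulation_batch=2)
for _ in range(3):
    replay.append(Transition(
        state=[[0.0], [0.0]], action=[0, 1], reward=[1.0, 0.5],
        log_prob=[-0.1, -0.2], entropy=[0.3, 0.3], critic=[0.0, 0.0],
        mask=[1, 1], next_state=[[0.0], [0.0]],
    ))

print(len(replay.rewards()), len(replay.rewards()[0]))  # 3 2
print(replay.grad_batch)  # 6
```

The nested lists play the role of the [steps, simulation] tensors shown by describe().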
class olympus.reinforcement.replay.Transition(state, action, reward, log_prob, entropy, critic, mask, next_state)
Bases: tuple
Attributes:
- state: Alias for field number 0
- action: Alias for field number 1
- reward: Alias for field number 2
- log_prob: Alias for field number 3
- entropy: Alias for field number 4
- critic: Alias for field number 5
- mask: Alias for field number 6
- next_state: Alias for field number 7
Methods
- count(self, value, /): Return number of occurrences of value.
- index(self, value[, start, stop]): Return first index of value.
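The "Bases: tuple" entry and the "Alias for field number N" descriptions suggest that Transition is a standard collections.namedtuple. Assuming that is the case, the sketch below rebuilds an equivalent type locally (it does not import olympus) and shows that each named field is the same value as the corresponding tuple position. The field values are arbitrary placeholders.

```python
from collections import namedtuple

# Local stand-in, assuming Transition is a plain namedtuple with these fields.
Transition = namedtuple(
    "Transition",
    ["state", "action", "reward", "log_prob", "entropy", "critic", "mask", "next_state"],
)

t = Transition(
    state="s0", action=2, reward=0.5, log_prob=-0.7,
    entropy=0.4, critic=0.9, mask=1, next_state="s1",
)

# Named access and positional access return the same value.
print(t.reward == t[2])       # True
print(t.next_state == t[7])   # True

# Field order matches the documented field numbers.
print(Transition._fields.index("critic"))  # 5

# Inherited tuple methods operate on the stored values.
print(t.count(0.5))  # 1
print(t.index(0.5))  # 2
```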