SGD

class olympus.optimizers.sgd.SGD(model_parameters, weight_decay, lr, momentum)[source]

Bases: olympus.optimizers.base.OptimizerAdapter

Stochastic gradient descent (SGD) with momentum. See the Wikipedia article on stochastic gradient descent for background.
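The update rule is not spelled out on this page. As a minimal sketch, assuming the common PyTorch-style formulation (weight decay folded into the gradient, no dampening, no Nesterov correction), one step could look like the following. The function name `sgd_momentum_step` and the list-of-floats representation are hypothetical and chosen only for illustration; they are not part of the olympus API.

```python
def sgd_momentum_step(params, grads, velocity, lr, momentum, weight_decay):
    """Apply one in-place SGD-with-momentum update to lists of floats.

    Assumed formulation (not confirmed by the olympus docs):
        g <- g + weight_decay * p
        v <- momentum * v + g
        p <- p - lr * v
    """
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + weight_decay * p                  # L2 penalty added to the gradient
        velocity[i] = momentum * velocity[i] + g  # accumulate the velocity
        params[i] = p - lr * velocity[i]          # move against the velocity
    return params, velocity


# Toy problem: minimise f(x) = x^2, whose gradient is 2x, starting from x = 1.0.
params, velocity = [1.0], [0.0]
for _ in range(200):
    grads = [2.0 * x for x in params]
    sgd_momentum_step(params, grads, velocity,
                      lr=0.1, momentum=0.9, weight_decay=0.0)
```

On this toy quadratic the iterate oscillates around zero with shrinking amplitude, which is the typical behaviour of momentum on a well-conditioned problem. Setting `momentum=0.0` reduces the sketch to plain gradient descent.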

References

[1] Aleksandar Botev, Guy Lever, David Barber. “Nesterov’s Accelerated Gradient and Momentum as approximations to Regularised Update Descent”, 7 Jul 2016.

Attributes:
param_groups
state

Methods

add_param_group(param_group) Add a param group to the Optimizer's param_groups.
backward(loss) This method comes from the FP16 optimizer; it is provided on every optimizer for consistency.
defaults() Specifies the default values of the hyperparameters.
get_space() Specifies the hyperparameters supported by this optimizer.
load_state_dict(state_dict[, strict]) Loads the optimizer state.
state_dict([destination, prefix, keep_vars]) Returns the state of the optimizer as a dict.
step([closure]) Performs a single optimization step (parameter update).
zero_grad() Sets the gradients of all optimized torch.Tensors to zero.
static defaults()[source]

Specifies the default values of the hyperparameters.

static get_space()[source]

Specifies the hyperparameters supported by this optimizer.