SGD

class olympus.optimizers.sgd.SGD(model_parameters, weight_decay, lr, momentum)

Stochastic gradient descent (SGD) with momentum. See Wikipedia for background.
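As a rough illustration of what "SGD with momentum" means, the sketch below applies the update to plain Python floats. It assumes the common PyTorch-style formulation (weight decay folded into the gradient, then a velocity buffer scaled by `momentum`); this page does not confirm that Olympus uses exactly this form. `sgd_momentum_step` is a hypothetical helper, not part of Olympus; only the argument names `lr`, `momentum` and `weight_decay` mirror the constructor above.

```python
def sgd_momentum_step(params, grads, velocity, lr, momentum, weight_decay):
    """Apply one in-place SGD-with-momentum update to lists of floats.

    Hypothetical helper for illustration only; assumes the PyTorch-style
    update rule, which may differ from the Olympus implementation.
    """
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + weight_decay * p                   # L2 penalty folded into the gradient
        velocity[i] = momentum * velocity[i] + g   # accumulate the velocity buffer
        params[i] = p - lr * velocity[i]           # move against the velocity
    return params, velocity


# Minimise f(x) = x^2 starting from x = 1.0 (the gradient is 2x).
params, velocity = [1.0], [0.0]
for _ in range(100):
    grads = [2.0 * p for p in params]
    sgd_momentum_step(params, grads, velocity,
                      lr=0.1, momentum=0.9, weight_decay=0.0)
```

With these settings the iterate oscillates around the minimum while its amplitude shrinks, which is the typical behaviour of momentum on a quadratic.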

References

[1] Aleksandar Botev, Guy Lever, David Barber. “Nesterov’s Accelerated Gradient and Momentum as approximations to Regularised Update Descent”, 7 Jul 2016.

Attributes:	param_groups, state

Methods

`add_param_group`(self, param_group)	Adds a param group to the `Optimizer`'s `param_groups`.
`backward`(self, loss)	This method comes from the FP16 optimizer; it is added to every optimizer for consistency.
`defaults`()	Specifies the default values of the hyperparameters.
`get_space`()	Specifies the hyperparameters supported by this optimizer.
`load_state_dict`(self, state_dict[, strict])	Loads the optimizer state.
`state_dict`(self[, destination, prefix, …])	Returns the state of the optimizer as a `dict`.
`step`(self[, closure])	Performs a single optimization step (parameter update).
`zero_grad`(self)	Clears the gradients of all optimized `torch.Tensor`s.

static get_space(): Specifies the hyperparameters supported by this optimizer.