SGD

class olympus.optimizers.sgd.SGD(model_parameters, weight_decay, lr, momentum)

Stochastic gradient descent (SGD) with momentum. See the Wikipedia article on stochastic gradient descent for background.
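The update that "SGD with momentum" usually refers to can be sketched in a few lines of plain Python. This is only an illustration, not the olympus implementation: the helper name `sgd_momentum_step` is hypothetical, and the convention used (weight decay added to the gradient as an L2 term, no dampening, no Nesterov correction) is the default behaviour of PyTorch's `torch.optim.SGD`, which this class is assumed, but not confirmed, to follow. The arguments `lr`, `momentum` and `weight_decay` mirror the constructor parameters listed above.

```python
def sgd_momentum_step(params, grads, velocity, lr, momentum, weight_decay):
    """One SGD-with-momentum update, applied in place to `params` and `velocity`.

    Hypothetical helper for illustration; the convention is an assumption
    (weight decay as an L2 term, no dampening, no Nesterov).
    """
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + weight_decay * p                   # fold the L2 penalty into the gradient
        velocity[i] = momentum * velocity[i] + g   # accumulate the momentum buffer
        params[i] = p - lr * velocity[i]           # move against the buffered direction
    return params, velocity


# Minimise f(w) = w^2 (gradient 2w), starting from w = 1.0.
params, velocity = [1.0], [0.0]
for _ in range(200):
    grads = [2.0 * w for w in params]
    sgd_momentum_step(params, grads, velocity,
                      lr=0.1, momentum=0.9, weight_decay=0.0)
```

With `momentum=0.0` the buffer equals the current gradient and the step reduces to plain gradient descent; larger values let earlier gradients keep contributing to the direction of travel.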

References

[1]	Aleksandar Botev, Guy Lever, David Barber. “Nesterov’s Accelerated Gradient and Momentum as approximations to Regularised Update Descent”, 7 Jul 2016

Attributes:	param_groups, state

Methods

`add_param_group`(param_group)	Adds a param group to the `Optimizer`'s `param_groups`.
`backward`(loss)	Originates in the FP16 optimizer; added to every optimizer for consistency.
`defaults`()	Specifies the default hyperparameter values.
`get_space`()	Specifies the hyperparameters supported by this optimizer.
`load_state_dict`(state_dict[, strict])	Loads the optimizer state.
`state_dict`([destination, prefix, keep_vars])	Returns the state of the optimizer as a `dict`.
`step`([closure])	Performs a single optimization step (parameter update).
`zero_grad`()	Sets the gradients of all optimized `torch.Tensor`s to zero.

static get_space(): Specifies the hyperparameters supported by this optimizer.