Creates a Momentum SGD learner instance to learn the parameters.
learner_momentum_sgd(parameters, lr, momentum, unit_gain = cntk$default_unit_gain_value(), l1_regularization_weight = 0, l2_regularization_weight = 0, gaussian_noise_injection_std_dev = 0, gradient_clipping_threshold_per_sample = np$inf, gradient_clipping_with_truncation = TRUE, use_mean_gradient = FALSE)
| parameters | list of network parameters to tune |
|---|---|
| lr | output of learning_rate_schedule() – the learning rate schedule |
| momentum | output of momentum_schedule() – the momentum schedule |
| unit_gain | logical – whether to interpret momentum as a unit-gain filter; defaults to cntk$default_unit_gain_value() |
| l1_regularization_weight | double (optional) – the L1 regularization weight per sample; defaults to 0 |
| l2_regularization_weight | double (optional) – the L2 regularization weight per sample; defaults to 0 |
| gaussian_noise_injection_std_dev | double (optional) – the standard deviation of the Gaussian noise added to parameters after each update; defaults to 0 |
| gradient_clipping_threshold_per_sample | double (optional) – the gradient clipping threshold per sample; defaults to np$inf (no clipping) |
| gradient_clipping_with_truncation | logical (default TRUE) – use gradient clipping with truncation |
| use_mean_gradient | logical (default FALSE) – use the averaged gradient as input to the learner |
See also the underlying Python API documentation: https://www.cntk.ai/pythondocs/cntk.learners.html#cntk.learners.momentum_sgd
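
A minimal usage sketch follows. It assumes the package exposes learning_rate_schedule(), momentum_schedule(), and UnitType() mirroring the linked Python API, and that a model `z` with a `parameters` field has already been built elsewhere; the argument values are illustrative only.

```r
library(cntk)

# Assumed schedule helpers mirroring the Python API: a fixed learning rate
# per minibatch and a constant momentum value.
lr <- learning_rate_schedule(0.01, UnitType("minibatch"))
mm <- momentum_schedule(0.9)

# Create the Momentum SGD learner over the parameters of a previously built
# model `z` (hypothetical here), with a small L2 penalty as an example.
learner <- learner_momentum_sgd(
  z$parameters, lr, mm,
  l2_regularization_weight = 0.0001
)
```

The resulting learner would then be passed to a trainer together with the loss and metric, in the same way as the other learner constructors in this package.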