Creates a Momentum SGD learner instance to learn the parameters.

learner_momentum_sgd(parameters, lr, momentum,
  unit_gain = cntk$default_unit_gain_value(), l1_regularization_weight = 0,
  l2_regularization_weight = 0, gaussian_noise_injection_std_dev = 0,
  gradient_clipping_threshold_per_sample = np$inf,
  gradient_clipping_with_truncation = TRUE, use_mean_gradient = FALSE)

Arguments

parameters

list of network parameters to tune

lr

(output of learning_rate_schedule()) – learning rate schedule

momentum

(output of momentum_schedule() or momentum_as_time_constant_schedule()) – momentum schedule

unit_gain

(logical) – whether to interpret momentum as a unit-gain filter. Defaults to the value returned by cntk$default_unit_gain_value().

l1_regularization_weight

(float, optional) – the L1 regularization weight per sample. Defaults to 0.0.

l2_regularization_weight

(float, optional) – the L2 regularization weight per sample. Defaults to 0.0.

gaussian_noise_injection_std_dev

(float, optional) – the standard deviation of the Gaussian noise added to the parameters after each update. Defaults to 0.0.

gradient_clipping_threshold_per_sample

(float, optional) – the gradient clipping threshold per sample. Defaults to infinity (no clipping).

gradient_clipping_with_truncation

(logical, default TRUE) – whether to use gradient clipping with truncation.

use_mean_gradient

(logical, default FALSE) – whether to use the averaged gradient as input to the learner.
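
Examples

A minimal sketch of constructing the learner. It assumes the package also exposes learning_rate_schedule() and momentum_schedule() as referenced above, that the schedule helper accepts a per-minibatch unit argument, and that 'model' is a previously built CNTK network exposing its trainable parameters as model$parameters; the exact signatures may differ in the installed bindings.

library(cntk)

# learning rate of 0.01, assumed here to be specified per minibatch
lr_schedule <- learning_rate_schedule(0.01, "minibatch")

# constant momentum of 0.9
m_schedule <- momentum_schedule(0.9)

# create the learner for all trainable parameters of the model,
# adding a small L2 penalty as weight decay
learner <- learner_momentum_sgd(
  model$parameters,
  lr = lr_schedule,
  momentum = m_schedule,
  l2_regularization_weight = 0.0001
)

The resulting learner would then typically be passed to a trainer together with the loss and evaluation functions, as in the Python API linked under References.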

References

https://www.cntk.ai/pythondocs/cntk.learners.html#cntk.learners.momentum_sgd