Creates a Momentum SGD learner instance to learn the parameters.
learner_momentum_sgd(parameters, lr, momentum, unit_gain = cntk$default_unit_gain_value(), l1_regularization_weight = 0, l2_regularization_weight = 0, gaussian_noise_injection_std_dev = 0, gradient_clipping_threshold_per_sample = np$inf, gradient_clipping_with_truncation = TRUE, use_mean_gradient = FALSE)
parameters | list of network parameters to tune |
---|---|
lr | (output of learning_rate_schedule()) – learning rate schedule |
momentum | (output of momentum_schedule() or momentum_as_time_constant_schedule()) – momentum schedule |
unit_gain | logical – whether to interpret momentum as a unit-gain filter; defaults to cntk$default_unit_gain_value() |
l1_regularization_weight | double – L1 regularization weight per sample, defaults to 0 |
l2_regularization_weight | double – L2 regularization weight per sample, defaults to 0 |
gaussian_noise_injection_std_dev | double – standard deviation of the Gaussian noise added to the parameters after each update, defaults to 0 |
gradient_clipping_threshold_per_sample | double – gradient clipping threshold per sample, defaults to infinity (np$inf) |
gradient_clipping_with_truncation | logical – clip gradients with truncation, defaults to TRUE |
use_mean_gradient | logical – use the averaged gradient as input to the learner, defaults to FALSE |
https://www.cntk.ai/pythondocs/cntk.learners.html#cntk.learners.momentum_sgd
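To make the `unit_gain` parameter concrete, the two momentum variants it selects can be sketched in plain Python. This is an illustrative sketch of the update rule only, not CNTK code; the function name `momentum_step` and the scalar-weight setup are invented for the example.

```python
# Illustrative sketch of the two momentum-SGD variants selected by
# `unit_gain`; not CNTK code.

def momentum_step(w, v, grad, lr, momentum, unit_gain=True):
    """One momentum-SGD update for a single scalar weight.

    unit_gain=True : v <- momentum*v + (1 - momentum)*grad  (unit-gain filter)
    unit_gain=False: v <- momentum*v + grad                 (classic momentum)
    """
    if unit_gain:
        v = momentum * v + (1.0 - momentum) * grad
    else:
        v = momentum * v + grad
    w = w - lr * v
    return w, v

# With momentum = 0, both variants reduce to plain SGD:
w, v = momentum_step(1.0, 0.0, grad=0.5, lr=0.1, momentum=0.0)
# w == 0.95, v == 0.5
```

With the unit-gain form, the velocity is a weighted average of past gradients, so the effective step size stays on the scale of `lr` regardless of the momentum value; with the classic form, gradients accumulate and the effective step size grows by roughly `1 / (1 - momentum)`.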