Each element of the input is independently either set to 0 (with probability dropout_rate) or multiplied by 1 / (1 - dropout_rate) (with probability 1 - dropout_rate). Dropout is a widely used regularization technique for reducing overfitting.

op_dropout(x, dropout_rate = 0, seed = 4294967293, name = "")

Arguments

x

matrix or CNTK Function that outputs a tensor

dropout_rate

(float) probability that an element of x is set to 0

seed

(int) random seed

name

(str) the name of the Function instance in the network

Details

This behavior applies only during training; during inference, dropout is a no-op. The paper that introduced dropout suggested scaling the weights during inference. In CNTK's implementation this is not necessary: the values that are not set to 0 are multiplied by 1 / (1 - dropout_rate) during training, so the expected value of each element is already unchanged.
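
The scaling described above can be illustrated with a minimal base R sketch. It does not call CNTK: dropout_sketch and its training argument are hypothetical names introduced only for this illustration, and the sketch makes no claim about how CNTK generates its random mask.

```r
# Hypothetical helper, not part of the CNTK R bindings:
# a base R illustration of dropout with rescaling during training.
dropout_sketch <- function(x, dropout_rate = 0, training = TRUE) {
  stopifnot(dropout_rate >= 0, dropout_rate < 1)
  # During inference (or with a rate of 0) the input passes through unchanged.
  if (!training || dropout_rate == 0) return(x)
  # Keep each element independently with probability 1 - dropout_rate.
  keep <- runif(length(x)) >= dropout_rate
  # Zero the dropped elements and rescale the kept ones.
  x * keep / (1 - dropout_rate)
}

set.seed(1)
x <- matrix(1, nrow = 200, ncol = 200)
y <- dropout_sketch(x, dropout_rate = 0.25)

sort(unique(as.vector(y)))  # the two values 0 and 1 / (1 - 0.25)
mean(y)                     # close to mean(x), which is 1
identical(dropout_sketch(x, 0.25, training = FALSE), x)  # inference is a no-op
```

Because the kept elements are scaled up during training, the mean of the output stays close to the mean of the input, which is why no extra scaling is needed at inference time.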