Computes the gradient of $f(z) = \log \sum_i \exp(z_i)$ at $z = x$, which is the softmax of x.

op_softmax(x, axis = NULL, name = "")

Arguments

x

matrix or CNTK Function that outputs a tensor

axis

axis across which to perform operation

name

(str) the name of the Function instance in the network

Details

$$\mathrm{softmax}(x) = \left[ \frac{\exp(x_1)}{\sum_i \exp(x_i)}, \; \frac{\exp(x_2)}{\sum_i \exp(x_i)}, \; \ldots, \; \frac{\exp(x_n)}{\sum_i \exp(x_i)} \right]$$

with the understanding that the implementation can use equivalent formulas for efficiency and numerical stability.
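One common "equivalent formula" of this kind subtracts the largest entry before exponentiating, which leaves the result unchanged but avoids overflow. The sketch below uses base R only; `stable_softmax` and `naive_softmax` are hypothetical helper names for illustration, and nothing here is claimed to be how CNTK implements `op_softmax` internally.

```r
# Hypothetical helpers for illustration; not part of the CNTK package.

# Direct translation of the formula: overflows for large inputs.
naive_softmax <- function(x) exp(x) / sum(exp(x))

# Shift by the maximum first. The shift cancels in the ratio,
# and the largest exponent becomes exp(0) = 1, so exp() cannot overflow.
stable_softmax <- function(x) {
  e <- exp(x - max(x))
  e / sum(e)
}

x <- c(1000, 1001, 1002)
naive_softmax(x)   # exp(1000) is Inf in double precision, so every entry is NaN
stable_softmax(x)  # finite values that sum to 1
```

For moderate inputs such as `c(1, 2, 3)` the two functions agree up to floating-point rounding; they differ only when `exp()` would overflow.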

The output is a vector of non-negative numbers that sum to 1, so it can be interpreted as a probability distribution over mutually exclusive outcomes, as in multiclass classification.
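For reference, the connection between the log-sum-exp function and the softmax entries follows from the chain rule, and the sum-to-one property follows immediately from the result. The index $j$ below is introduced only for this derivation:

```latex
\frac{\partial f}{\partial z_j}
  = \frac{\partial}{\partial z_j} \log \sum_i \exp(z_i)
  = \frac{\exp(z_j)}{\sum_i \exp(z_i)},
\qquad
\sum_j \frac{\exp(z_j)}{\sum_i \exp(z_i)} = 1 .
```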

If axis is given, the softmax will be computed along that axis.
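The effect of choosing an axis can be illustrated in base R on a small matrix. This is only a sketch of the idea: `stable_softmax` is a hypothetical helper, and the example does not assume how CNTK numbers its axes or what `axis = NULL` selects.

```r
# Hypothetical helper for illustration; not part of the CNTK package.
stable_softmax <- function(x) {
  e <- exp(x - max(x))
  e / sum(e)
}

m <- matrix(c(1, 2, 3,
              1, 1, 1), nrow = 2, byrow = TRUE)

# Softmax along each row: apply() returns the results as columns,
# so transpose to get back the original 2 x 3 layout.
by_row <- t(apply(m, 1, stable_softmax))

# Softmax along each column: the result already has the 2 x 3 layout.
by_col <- apply(m, 2, stable_softmax)

rowSums(by_row)  # each row sums to 1
colSums(by_col)  # each column sums to 1
```

The same input therefore yields different outputs depending on the axis: only the slices along the chosen axis are normalized to sum to 1.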