# 06.6 Classification with the Hyperbolic Tangent Function

## 6.6 Using Tanh as the Binary Classification Function

### 6.6.1 Posing the Problem

So far, binary classification has used the Logistic function and the binary cross-entropy loss:

$$a_i=Logistic(z_i) = \frac{1}{1 + e^{-z_i}} \tag{1}$$

$$loss_i(w,b)=-[y_i \ln a_i + (1-y_i) \ln (1-a_i)] \tag{2}$$

Can the hyperbolic tangent function serve as the classification function instead?

$$Tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} = \frac{2}{1 + e^{-2z}} - 1 \tag{3}$$
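Equation (3) says that Tanh is just a shifted and rescaled Logistic function: $Tanh(z) = 2 \cdot Logistic(2z) - 1$. This identity can be verified numerically (a standalone sketch, not part of the chapter's code):

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh_from_logistic(z):
    # Tanh(z) = 2 * Logistic(2z) - 1, per equation (3)
    return 2.0 * logistic(2 * z) - 1.0

z = np.linspace(-5, 5, 101)
# the identity holds to floating-point precision
max_err = np.max(np.abs(tanh_from_logistic(z) - np.tanh(z)))
```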

### 6.6.2 Modifying the Forward Computation and Backpropagation Functions

#### Adding the Tanh Classification Function

```python
class Tanh(object):
    def forward(self, z):
        # equation (3), in the numerically cheaper two-term form
        a = 2.0 / (1.0 + np.exp(-2 * z)) - 1.0
        return a
```
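Before wiring the activation into the network, note a property that will matter shortly: unlike Logistic, Tanh outputs exactly 0 at $z=0$. A standalone check (the function name here is illustrative):

```python
import numpy as np

def tanh_forward(z):
    # same formula as the Tanh activation above
    return 2.0 / (1.0 + np.exp(-2 * z)) - 1.0

a = tanh_forward(np.array([-10.0, 0.0, 10.0]))
# output range is (-1, 1), and z = 0 maps to exactly 0;
# Logistic, by contrast, never outputs 0
```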


#### Modifying the Forward Computation Method

```python
class TanhNeuralNet(NeuralNet):
    def forwardBatch(self, batch_x):
        Z = np.dot(batch_x, self.W) + self.B
        if self.params.net_type == NetType.BinaryClassifier:
            A = Sigmoid().forward(Z)
            return A
        elif self.params.net_type == NetType.BinaryTanh:
            A = Tanh().forward(Z)
            return A
        else:
            return Z
```


A new enum value, `BinaryTanh`, marks the Tanh-based classifier:

```python
class NetType(Enum):
    Fitting = 1
    BinaryClassifier = 2
    MultipleClassifier = 3
    BinaryTanh = 4
```


#### Modifying the Backpropagation Method

Keeping the cross-entropy loss (2) unchanged, its derivative with respect to $a_i$ is:

$$\frac{\partial{loss_i}}{\partial{a_i}}= \frac{a_i-y_i}{a_i(1-a_i)} \tag{4}$$

The derivative of the Tanh function is:

$$\frac{\partial{a_i}}{\partial{z_i}}=(1-a_i)(1+a_i) \tag{5}$$

By the chain rule:

$$\begin{aligned} \frac{\partial loss_i}{\partial z_i}&=\frac{\partial loss_i}{\partial a_i} \frac{\partial a_i}{\partial z_i} \\\\ &= \frac{a_i-y_i}{a_i(1-a_i)} (1+a_i)(1-a_i) \\\\ &= \frac{(a_i-y_i)(1+a_i)}{a_i} \end{aligned} \tag{6}$$
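Equation (6) can be spot-checked against a numerical derivative (a standalone sketch; the values of $z$ and $y$ are arbitrary, with $z>0$ so that $a \in (0,1)$ keeps both logarithms defined):

```python
import numpy as np

def tanh_a(z):
    return 2.0 / (1.0 + np.exp(-2 * z)) - 1.0

def ce_loss(z, y):
    # original cross-entropy (2) with a = Tanh(z)
    a = tanh_a(z)
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

def dz_analytic(z, y):
    # equation (6): (a - y)(1 + a) / a
    a = tanh_a(z)
    return (a - y) * (1 + a) / a

z, y, eps = 0.7, 1.0, 1e-6
# central finite difference approximates d loss / d z
dz_numeric = (ce_loss(z + eps, y) - ce_loss(z - eps, y)) / (2 * eps)
```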

```python
class TanhNeuralNet(NeuralNet):
    def backwardBatch(self, batch_x, batch_y, batch_a):
        m = batch_x.shape[0]
        dZ = (batch_a - batch_y) * (1 + batch_a) / batch_a
        dB = dZ.sum(axis=0, keepdims=True)/m
        dW = np.dot(batch_x.T, dZ)/m
        return dW, dB
```


Running this code immediately produces divide-by-zero warnings, and the loss becomes NaN:

```
epoch=0
Level4_TanhAsBinaryClassifier.py:29: RuntimeWarning: divide by zero encountered in true_divide
  dZ = (batch_a - batch_y) * (1 + batch_a) / batch_a
Level4_TanhAsBinaryClassifier.py:29: RuntimeWarning: invalid value encountered in true_divide
  dZ = (batch_a - batch_y) * (1 + batch_a) / batch_a
0 1 nan
0 3 nan
0 5 nan
......
```


Why did this never happen with the Logistic function?

1. The Logistic function's output range is $(0,1)$, so $a$ is always greater than 0 and can never equal 0. The Tanh function's output range is $(-1,1)$, so $a$ can be exactly 0.
2. The previous error term was `dZ = batch_a - batch_y`, which contains no division at all.
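The failure mode is easy to reproduce in isolation (a sketch with made-up values):

```python
import numpy as np

# the Tanh activation outputs exactly 0 at z = 0, so the backward term
# (a - y)(1 + a) / a divides by zero and the gradient becomes inf/nan
batch_a = np.array([[0.5], [0.0]])
batch_y = np.array([[1.0], [1.0]])
with np.errstate(divide='ignore', invalid='ignore'):
    dZ = (batch_a - batch_y) * (1 + batch_a) / batch_a
has_bad = not np.all(np.isfinite(dZ))
```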

### 6.6.3 Modifying the Loss Function

The original cross-entropy loss is designed for outputs $a_i \in (0,1)$:

$$loss_i=-[y_i \ln a_i + (1-y_i) \ln (1-a_i)]$$

Mirroring its structure for Tanh's output range $(-1,1)$, replace $y_i$ with $1+y_i$ and $a_i$ with $1+a_i$ in the first term:

$$loss_i=-[(1+y_i) \ln (1+a_i) + (1-y_i) \ln (1-a_i)] \tag{7}$$

Its derivative with respect to $a_i$ is:

$$\frac{\partial loss}{\partial a_i} = \frac{2(a_i-y_i)}{(1+a_i)(1-a_i)} \tag{8}$$

The $(1+a_i)(1-a_i)$ factor cancels against the Tanh derivative (5), leaving a clean error term:

$$\begin{aligned} \frac{\partial loss_i}{\partial z_i}&=\frac{\partial loss_i}{\partial a_i}\frac{\partial a_i}{\partial z_i} \\\\ &=\frac{2(a_i-y_i)}{(1+a_i)(1-a_i)} (1+a_i)(1-a_i) \\\\ &=2(a_i-y_i) \end{aligned} \tag{9}$$
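As with equation (6), the result (9) can be spot-checked numerically (a standalone sketch with arbitrary $z$ and $y$):

```python
import numpy as np

def tanh_a(z):
    return 2.0 / (1.0 + np.exp(-2 * z)) - 1.0

def ce2_loss(z, y):
    # modified cross-entropy (7) with a = Tanh(z)
    a = tanh_a(z)
    return -((1 + y) * np.log(1 + a) + (1 - y) * np.log(1 - a))

z, y, eps = -0.3, 1.0, 1e-6
# central finite difference vs. the closed form 2(a - y) of equation (9)
dz_numeric = (ce2_loss(z + eps, y) - ce2_loss(z - eps, y)) / (2 * eps)
dz_analytic = 2 * (tanh_a(z) - y)
```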

#### Adding the New Loss Function

```python
class LossFunction(object):
    def CE2_tanh(self, A, Y, count):
        p = (1-Y) * np.log(1-A) + (1+Y) * np.log(1+A)
        LOSS = np.sum(-p)
        loss = LOSS / count
        return loss
```


#### Modifying the Backpropagation Method

```python
class NeuralNet(object):
    def backwardBatch(self, batch_x, batch_y, batch_a):
        m = batch_x.shape[0]
        # step 1 - use original cross-entropy function
        # dZ = (batch_a - batch_y) * (1 + batch_a) / batch_a
        # step 2 - modify cross-entropy function
        dZ = 2 * (batch_a - batch_y)
        ......
```
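Unlike the step-1 formula, the simplified error term contains no division, so $a=0$ is now harmless (a standalone sketch):

```python
import numpy as np

# the simplified error term 2(a - y) never divides, so a = 0 is safe
batch_a = np.array([[0.0], [0.5]])
batch_y = np.array([[1.0], [-1.0]])
dZ = 2 * (batch_a - batch_y)
all_finite = np.all(np.isfinite(dZ))
```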


This time training proceeds without warnings, but the loss value is negative:

```
epoch=0
0 1 -0.1882585728753378
W= [[0.04680528]
 [0.10793676]]
B= [[0.16576018]]
A= [[0.28416676]
 [0.24881074]
 [0.21204905]]
w12= -0.4336361115243373
b12= -1.5357156668786782
```


Compare the original loss (2) with the modified loss (7):

$$loss_i=-[y_i \ln(a_i)+(1-y_i) \ln (1-a_i)] \tag{2}$$

$$loss_i=-[(1+y_i) \ln (1+a_i) + (1-y_i) \ln (1-a_i)] \tag{7}$$

The Tanh output satisfies $a \in (-1,1)$, so $1+a \in (0,2)$ and $1-a \in (0,2)$. Whenever one of these falls into $(1,2)$, $\ln(1+a)$ or $\ln(1-a)$ is greater than 0, which can make the loss negative. To keep using cross entropy as originally designed, both $1+a$ and $1-a$ must be mapped back into the $(0,1)$ range.

### 6.6.4 Modifying the Loss Function Code Again

Dividing each logarithm's argument by 2 maps it into $(0,1)$:

$$loss_i=-[(1+y_i) \ln (\frac{1+a_i}{2})+(1-y_i) \ln (\frac{1-a_i}{2})] \tag{10}$$

Since dividing by 2 only shifts the loss by a constant, the error term (9) is unchanged:

$$\frac{\partial loss_i}{\partial z_i} =2(a_i-y_i) \tag{9}$$

```python
class LossFunction(object):
    def CE2_tanh(self, A, Y, count):
        #p = (1-Y) * np.log(1-A) + (1+Y) * np.log(1+A)
        p = (1-Y) * np.log((1-A)/2) + (1+Y) * np.log((1+A)/2)
        ......
```
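A quick standalone comparison of the two versions shows that, per sample, the division by 2 shifts the loss up by the constant $2\ln 2$, which makes the value non-negative for ±1 labels while leaving all gradients unchanged (sample values here are made up):

```python
import numpy as np

def ce2_tanh(A, Y):
    # earlier version, equation (7): can go negative
    p = (1 - Y) * np.log(1 - A) + (1 + Y) * np.log(1 + A)
    return np.sum(-p) / len(A)

def ce2_tanh_fixed(A, Y):
    # fixed version: both log arguments scaled into (0, 1)
    p = (1 - Y) * np.log((1 - A) / 2) + (1 + Y) * np.log((1 + A) / 2)
    return np.sum(-p) / len(A)

A = np.array([[0.9], [-0.8]])
Y = np.array([[1.0], [-1.0]])
old = ce2_tanh(A, Y)        # negative for confident correct predictions
new = ce2_tanh_fixed(A, Y)  # non-negative
```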


### 6.6.5 Modifying the Sample Label Values

Derive a subclass `SimpleDataReader_tanh` from the `SimpleDataReader` class and add a `ToZeroOne()` method that converts the original 0/1 labels into -1/1 labels.

```python
class SimpleDataReader_tanh(SimpleDataReader):
    def ToZeroOne(self):
        Y = np.zeros((self.num_train, 1))
        for i in range(self.num_train):
            if self.YTrain[i,0] == 0:     # first class: label 0 becomes -1
                Y[i,0] = -1
            elif self.YTrain[i,0] == 1:   # second class: label 1 stays 1
                Y[i,0] = 1
        ......
```
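The conversion above can also be written without a loop using `np.where` (an equivalent standalone sketch):

```python
import numpy as np

# 0/1 labels -> -1/1 labels, vectorized
YTrain = np.array([[0], [1], [1], [0]])
Y = np.where(YTrain == 0, -1, 1)
```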


```python
def draw_predicate_data(net):
    ......
    for i in range(3):
        # if a[i,0] > 0.5:  # logistic function
        if a[i,0] > 0:      # tanh function
        ......
```
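The new threshold of 0 is consistent with the old Logistic threshold of 0.5: both correspond to the same decision boundary $z=0$ (a standalone check):

```python
import numpy as np

# Logistic(z) > 0.5  <=>  z > 0  <=>  Tanh(z) > 0,
# so the two thresholds classify every sample identically
z = np.array([-2.0, -0.1, 0.3, 4.0])
logistic_pred = (1.0 / (1.0 + np.exp(-z))) > 0.5
tanh_pred = (2.0 / (1.0 + np.exp(-2 * z)) - 1.0) > 0
same = np.array_equal(logistic_pred, tanh_pred)
```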


```python
if __name__ == '__main__':
    ......
    reader.ToZeroOne()  # change label value from 0/1 to -1/1
    # net
    params = HyperParameters(eta=0.1, max_epoch=100, batch_size=10, eps=1e-3, net_type=NetType.BinaryTanh)
    ......
    net = TanhNeuralNet(params, num_input, num_output)
    ......
```


Code location: ch06, Level5