Tanh¶
The Tanh layer implements the hyperbolic tangent activation function. It is an odd function (symmetric about the origin) that maps input values into the range $(-1, 1)$. Because its outputs are zero-centered, their mean stays closer to zero than that of the Sigmoid function, whose outputs lie in $(0, 1)$. This often helps deep models train faster.
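The zero-centering claim can be illustrated with a small NumPy experiment. This is only a sketch: the standard-normal inputs, the sample size, and the seed are assumptions chosen for illustration, and the sigmoid is written out by hand rather than taken from sorix.

```python
import numpy as np

# Assumed setup: zero-mean Gaussian pre-activations (illustrative only).
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)

tanh_out = np.tanh(x)
sigmoid_out = 1.0 / (1.0 + np.exp(-x))

tanh_mean = tanh_out.mean()        # close to 0 for symmetric inputs
sigmoid_mean = sigmoid_out.mean()  # close to 0.5 for symmetric inputs

print(f"mean of tanh(x):    {tanh_mean:+.4f}")
print(f"mean of sigmoid(x): {sigmoid_mean:+.4f}")
```

For inputs distributed symmetrically around zero, the Tanh outputs average out near zero, while the Sigmoid outputs average out near 0.5.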
Mathematical definition¶
For an input $x \in \mathbb{R}$, the Tanh function is defined as:
$$ \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. $$
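As a quick sanity check, the definition above can be evaluated directly with NumPy and compared against `np.tanh`. The helper name `tanh_from_definition` is hypothetical and not part of sorix. This is a minimal sketch, not how the layer is implemented.

```python
import numpy as np

def tanh_from_definition(x):
    """Evaluate tanh straight from the exponential formula (illustrative helper)."""
    x = np.asarray(x, dtype=np.float64)
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
y = tanh_from_definition(x)

# The direct formula agrees with NumPy's built-in for moderate inputs.
assert np.allclose(y, np.tanh(x))
# Outputs lie strictly inside (-1, 1).
assert np.all(np.abs(y) < 1.0)
# tanh is odd: tanh(-x) = -tanh(x).
assert np.allclose(tanh_from_definition(-x), -y)
```

The literal formula overflows for large $|x|$ because `np.exp` returns `inf`, so a library routine such as `np.tanh` is the safer choice in practice.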
Backward computation (gradient)¶
The derivative of the hyperbolic tangent function can be expressed elegantly in terms of its output $y = \tanh(x)$: $$\frac{d\tanh(x)}{dx} = 1 - y^2$$
During backpropagation, the gradient is propagated as: $$\frac{\partial \mathcal{L}}{\partial x} = \frac{\partial \mathcal{L}}{\partial y} \cdot (1 - y^2)$$
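The backward rule can be checked numerically with a central finite difference. The sketch below uses plain NumPy. The function `tanh_backward` is a hypothetical stand-in for the layer's backward step, and the upstream gradient of ones corresponds to the assumed loss $\mathcal{L} = \sum_i y_i$.

```python
import numpy as np

def tanh_backward(y, grad_output):
    """Chain rule for tanh using the cached forward output y = tanh(x)."""
    return grad_output * (1.0 - y ** 2)

x = np.linspace(-3.0, 3.0, 7)
y = np.tanh(x)
upstream = np.ones_like(x)  # dL/dy for the assumed loss L = sum(y)

analytic = tanh_backward(y, upstream)

# Central difference approximation of d tanh(x) / dx.
eps = 1e-6
numeric = (np.tanh(x + eps) - np.tanh(x - eps)) / (2 * eps)

assert np.allclose(analytic, numeric, atol=1e-8)
```

Expressing the derivative through $y$ means the backward pass can reuse the forward output instead of re-evaluating any exponentials.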
In [1]:
# Uncomment the next line and run this cell to install sorix
#!pip install 'sorix @ git+https://github.com/Mitchell-Mirano/sorix.git@main'
In [2]:
import numpy as np
import matplotlib.pyplot as plt
from sorix import tensor
from sorix.nn import Tanh
import sorix
plt.style.use('ggplot')
Visualizing Tanh¶
In [3]:
# Sample 100 evenly spaced inputs in [-5, 5]
x_vals = np.linspace(-5, 5, 100)
X = tensor(x_vals, requires_grad=True)
# Instantiate the layer and apply it element-wise
tanh = Tanh()
Y = tanh(X)
plt.figure(figsize=(10, 5))
plt.plot(x_vals, Y.numpy(), label='$Tanh(x) = \\tanh(x)$', color='#3498db', lw=2)
plt.axhline(0, color='black', lw=1, ls='--')
plt.title("Tanh Activation Function")
plt.xlabel("x")
plt.ylabel("$\\tanh(x)$")
plt.grid(True, alpha=0.3)
plt.legend()
plt.show()
Functional Example¶
In [4]:
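X = tensor([-3.0, -1.0, 0.0, 1.0, 3.0], requires_grad=True)
# Reuses the `tanh` layer instance created in the visualization cell
Y = tanh(X)
# Summing gives an upstream gradient of 1 per element, so X.grad equals 1 - Y**2
Y.sum().backward()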
print(f"Input: {X.numpy()}")
print(f"Output: {Y.numpy()}")
print(f"Gradients: {X.grad}")
Input: [-3. -1. 0. 1. 3.]
Output: [-0.9950548 -0.7615942 0. 0.7615942 0.9950548]
Gradients: tensor([0.009866 , 0.41997433, 1. , 0.41997433, 0.009866 ])
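The printed values can be cross-checked independently of sorix. The sketch below recomputes the outputs and the gradients $1 - y^2$ with NumPy and compares them to the numbers shown above. The tolerance is an assumption chosen to absorb the rounding in the printed output.

```python
import numpy as np

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
y = np.tanh(x)
grad = 1.0 - y ** 2  # upstream gradient is 1 because the loss is Y.sum()

# Values copied from the notebook output above.
printed_output = np.array([-0.9950548, -0.7615942, 0.0, 0.7615942, 0.9950548])
printed_grad = np.array([0.009866, 0.41997433, 1.0, 0.41997433, 0.009866])

assert np.allclose(y, printed_output, atol=1e-6)
assert np.allclose(grad, printed_grad, atol=1e-6)
```

The gradient peaks at 1 for $x = 0$ and falls to roughly 0.01 at $|x| = 3$, which shows how quickly Tanh saturates away from the origin.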