MSELoss
The Mean Squared Error (MSE) loss measures the average of the squared differences between predictions and targets. It is the most common loss function for regression tasks.
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Where:
- $n$ is the batch size.
- $y_i$ is the target value.
- $\hat{y}_i$ is the predicted value.
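As a sanity check, the formula above can be evaluated with plain NumPy (this is not Sorix code, just the definition applied to the sample values used below):

```python
import numpy as np

y_pred = np.array([2.5, 0.0, 2.1])
y_true = np.array([3.0, 0.0, 2.0])

# MSE = mean of the squared errors: (0.25 + 0.0 + 0.01) / 3
mse = np.mean((y_pred - y_true) ** 2)
print(f"{mse:.4f}")  # 0.0867
```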
In [1]:
# Uncomment the next line and run this cell to install sorix
#!pip install 'sorix @ git+https://github.com/Mitchell-Mirano/sorix.git@main'
In [2]:
import numpy as np
from sorix import tensor
from sorix.nn import MSELoss
# Create data
y_pred = tensor([2.5, 0.0, 2.1], requires_grad=True)
y_true = tensor([3.0, 0.0, 2.0])
criterion = MSELoss()
loss = criterion(y_pred, y_true)
print(f"Predictions: {y_pred.numpy()}")
print(f"Targets: {y_true.numpy()}")
print(f"MSE Loss: {loss.item():.4f}")
Predictions: [2.5 0.  2.1]
Targets: [3. 0. 2.]
MSE Loss: 0.0867
Verification with Autograd¶
MSELoss in Sorix is fully differentiable. Running the backward pass exposes the gradients w.r.t. the predictions, which we can compare against the analytic derivative.
In [3]:
loss.backward()
print(f"Gradients w.r.t y_pred: {y_pred.grad}")
# Manual verification: dMSE/dy_pred_i = 2/n * (y_pred_i - y_true_i)
n = y_pred.data.size
manual_grad = 2/n * (y_pred.data - y_true.data)
print(f"Manual Gradients: {manual_grad}")
Gradients w.r.t y_pred: tensor([-0.33333334, 0. , 0.06666661], dtype=sorix.float64)
Manual Gradients: [-0.33333334  0.          0.0666666 ]
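The analytic gradient can also be cross-checked numerically. A minimal sketch in plain NumPy (independent of Sorix), using central finite differences on the MSE formula:

```python
import numpy as np

y_pred = np.array([2.5, 0.0, 2.1])
y_true = np.array([3.0, 0.0, 2.0])

def mse(p):
    return np.mean((p - y_true) ** 2)

# Central finite differences: perturb one coordinate at a time
eps = 1e-6
num_grad = np.zeros_like(y_pred)
for i in range(y_pred.size):
    p_plus, p_minus = y_pred.copy(), y_pred.copy()
    p_plus[i] += eps
    p_minus[i] -= eps
    num_grad[i] = (mse(p_plus) - mse(p_minus)) / (2 * eps)

# Should agree with the analytic gradient 2/n * (y_pred - y_true)
analytic = 2 / y_pred.size * (y_pred - y_true)
print(num_grad)  # ≈ [-0.3333, 0., 0.0667]
```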
Training Example¶
Let's see how MSELoss guides a single value to match a target.
In [4]:
from sorix.optim import SGD
weight = tensor([10.0], requires_grad=True)
target = tensor([42.0])
optimizer = SGD([weight], lr=0.1)
print(f"Initial weight: {weight.item():.2f}")
for i in range(21):
    loss = criterion(weight, target)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if i % 5 == 0:
        print(f"Step {i:2d} | Weight: {weight.item():.4f} | Loss: {loss.item():.4f}")
print(f"Final weight: {weight.item():.2f}")
Initial weight: 10.00
Step  0 | Weight: 16.4000 | Loss: 1024.0000
Step  5 | Weight: 33.6114 | Loss: 109.9512
Step 10 | Weight: 39.2512 | Loss: 11.8059
Step 15 | Weight: 41.0993 | Loss: 1.2676
Step 20 | Weight: 41.7049 | Loss: 0.1361
Final weight: 41.70
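For this single-parameter case the whole run has a closed form: each SGD step applies $w \leftarrow w - \eta \cdot 2(w - t)$, so the error $w - t$ shrinks by a constant factor $1 - 2\eta = 0.8$ per step. A plain-NumPy-free sketch reproducing the trajectory above:

```python
lr, target, w = 0.1, 42.0, 10.0

# SGD on MSE for a scalar: w <- w - lr * 2 * (w - target),
# so after k steps  w = target - (target - w0) * (1 - 2*lr)**k
for step in range(21):
    w -= lr * 2 * (w - target)

print(round(w, 4))  # ≈ 41.7049, matching the final step of the loop above
```

Because the contraction factor is 0.8, the loss decays geometrically by 0.8² = 0.64 per step, which is exactly the decay visible in the printed losses.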