
layers

sorix.nn.layers

Linear

Linear(
    features, neurons, bias=True, init="he", device="cpu"
)

Bases: Module

Applies a linear transformation to the incoming data.

Attributes:

  • W (Tensor) –

    Weights of the layer.

  • b (Tensor) –

    Biases of the layer.

Examples:

import numpy as np
from sorix import tensor            # assumed import path for tensor
from sorix.nn.layers import Linear

layer = Linear(10, 5)
x = tensor(np.random.randn(8, 10))
y = layer(x)
print(y.shape)  # (8, 5)
Source code in sorix/nn/layers.py
def __init__(
    self, 
    features: int, 
    neurons: int,
    bias: bool = True, 
    init: str = 'he',
    device: str = 'cpu'
) -> None:
    super().__init__()
    if device == 'cuda' and not _cupy_available:
        raise Exception('Cupy is not available')

    self.device = device
    xp = self.xp

    if init not in ['he', 'xavier']:
        raise ValueError(f'Invalid initialization method: {init}. Valid methods are "he" and "xavier"')

    if init == 'he':
        self.std_dev = xp.sqrt(2.0 / features)  # He init for ReLU
    elif init == 'xavier':
        self.std_dev = xp.sqrt(2.0 / (features + neurons))  # Xavier init for tanh

    self.bias = bias
    self.W = tensor(xp.random.normal(0, self.std_dev, size=(features, neurons)), 
                    device=self.device, requires_grad=True, dtype=float32)
    self.b = tensor(xp.zeros((1, neurons)), 
                    device=self.device, requires_grad=True, dtype=float32) if self.bias else None
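
The constructor above boils down to a small amount of NumPy math. Here is a sketch of that math only (not the sorix API): He initialization draws weights with std sqrt(2 / fan_in), biases start at zero, and the forward pass is a matrix product plus a broadcast bias.

```python
import numpy as np

features, neurons, batch = 10, 5, 8
rng = np.random.default_rng(0)

# He initialization: std = sqrt(2 / fan_in), matching init='he' above
std_dev = np.sqrt(2.0 / features)
W = rng.normal(0.0, std_dev, size=(features, neurons)).astype(np.float32)
b = np.zeros((1, neurons), dtype=np.float32)

# Forward pass: y = x @ W + b (b broadcasts over the batch dimension)
x = rng.normal(size=(batch, features)).astype(np.float32)
y = x @ W + b
print(y.shape)  # (8, 5)
```

Xavier initialization only changes the scale: std = sqrt(2 / (fan_in + fan_out)), which suits tanh-style activations.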

coef_ property

coef_

Returns weights as a flattened numpy array (Scikit-Learn parity).

intercept_ property

intercept_

Returns biases as a flattened numpy array or scalar (Scikit-Learn parity).
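
As a rough illustration of the scikit-learn convention these properties mirror (the exact flattening is defined by the library; the shapes here are just an example): the layer stores W as (features, neurons) and b as (1, neurons), and the properties expose flattened views.

```python
import numpy as np

# A hypothetical Linear(3, 1) layer: W is (features, neurons), b is (1, neurons)
W = np.arange(3, dtype=np.float32).reshape(3, 1)
b = np.zeros((1, 1), dtype=np.float32)

# Flattened views in the spirit of scikit-learn's coef_ / intercept_
coef = W.ravel()          # shape (3,)
intercept = b.ravel()[0]  # a scalar for a single output neuron
print(coef.shape, intercept)
```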

ReLU

ReLU()

Bases: Module

Rectified Linear Unit activation function.

Source code in sorix/nn/net.py
def __init__(self) -> None:
    super().__init__()
    self.device: str = 'cpu'
    self.training: bool = True

Sigmoid

Sigmoid()

Bases: Module

Numerically stable Sigmoid activation function.

Source code in sorix/nn/net.py
def __init__(self) -> None:
    super().__init__()
    self.device: str = 'cpu'
    self.training: bool = True

Tanh

Tanh()

Bases: Module

Hyperbolic tangent activation function.

Source code in sorix/nn/net.py
def __init__(self) -> None:
    super().__init__()
    self.device: str = 'cpu'
    self.training: bool = True
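
The three activations above are elementwise functions. A plain-NumPy sketch of what they compute (not the sorix implementation), including one common way to make sigmoid numerically stable, i.e. never exponentiating a large positive argument:

```python
import numpy as np

def relu(x):
    # max(0, x), elementwise
    return np.maximum(0.0, x)

def sigmoid(x):
    # Numerically stable: only exp() of non-positive values is ever taken
    out = np.empty_like(x, dtype=np.float64)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    e = np.exp(x[~pos])
    out[~pos] = e / (1.0 + e)
    return out

x = np.array([-30.0, -1.0, 0.0, 1.0, 30.0])
print(relu(x))
print(sigmoid(x))
print(np.tanh(x))  # tanh is already stable in NumPy
```

The naive form 1 / (1 + exp(-x)) overflows for large negative x; splitting on the sign avoids that while computing the same function.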

BatchNorm1d

BatchNorm1d(
    num_features, eps=1e-05, momentum=0.1, device="cpu"
)

Bases: Module

Applies Batch Normalization over a 2D input.

Source code in sorix/nn/layers.py
def __init__(
    self, 
    num_features: int, 
    eps: float = 1e-5, 
    momentum: float = 0.1, 
    device: str = 'cpu'
) -> None:
    super().__init__()
    self.device = device
    xp = self.xp

    self.gamma = tensor(xp.ones((1, num_features)), requires_grad=True, dtype=float32)
    self.beta = tensor(xp.zeros((1, num_features)), requires_grad=True, dtype=float32)

    # buffers (captured by state_dict)
    self.running_mean = tensor(xp.zeros((1, num_features)), requires_grad=False, dtype=float32)
    self.running_var = tensor(xp.ones((1, num_features)), requires_grad=False, dtype=float32)

    self.momentum = momentum
    self.eps = eps
    self.device = device

    if self.device != 'cpu':
        self.to(self.device)
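
The forward pass is not shown above, but the buffers created here imply the usual scheme: at training time the batch is normalized with its own statistics, and those statistics are blended into the running buffers via momentum. A NumPy sketch of that convention (the library's exact update rule may differ):

```python
import numpy as np

num_features, eps, momentum = 4, 1e-5, 0.1
gamma = np.ones((1, num_features), dtype=np.float32)
beta = np.zeros((1, num_features), dtype=np.float32)
running_mean = np.zeros((1, num_features), dtype=np.float32)
running_var = np.ones((1, num_features), dtype=np.float32)

rng = np.random.default_rng(0)
x = rng.normal(3.0, 2.0, size=(32, num_features)).astype(np.float32)

# Training step: normalize with the batch's own statistics
mean = x.mean(axis=0, keepdims=True)
var = x.var(axis=0, keepdims=True)
y = gamma * (x - mean) / np.sqrt(var + eps) + beta

# ...then blend batch statistics into the running buffers
running_mean = (1 - momentum) * running_mean + momentum * mean
running_var = (1 - momentum) * running_var + momentum * var

print(y.mean(axis=0))  # ~0 per feature
print(y.std(axis=0))   # ~1 per feature
```

At inference the running buffers would replace the batch statistics, so the output no longer depends on the other samples in the batch.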

Dropout

Dropout(p=0.5)

Bases: Module

During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution.

This implementation uses Inverted Dropout, meaning that the output is scaled by 1/(1-p) during training. This ensures that the expected value of the activations remains constant, allowing the layer to act as an identity function during inference.

Parameters:

  • p (float, default: 0.5 ) –

    Probability of an element to be zeroed. Default: 0.5

Source code in sorix/nn/layers.py
def __init__(self, p: float = 0.5) -> None:
    super().__init__()
    self.p = p
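
The inverted-dropout behavior described above can be sketched in plain NumPy (a sketch of the technique, not the sorix implementation): each element survives with probability 1 - p, survivors are scaled by 1/(1 - p) so the expected activation is unchanged, and at inference the layer is the identity.

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    # Inverted dropout: scale kept activations by 1/(1-p) at training time
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = (rng.random(x.shape) >= p).astype(x.dtype)
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones((4, 4), dtype=np.float32)
y = dropout(x, p=0.5, training=True, rng=rng)
# Surviving entries are scaled to 2.0, so E[y] equals x
print(y)
print(dropout(x, training=False))  # identity at inference
```

Because the scaling happens during training, no rescaling is needed at inference time, which is exactly what lets the layer act as an identity function there.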