inox.nn.linear#
Linear layers
Classes#
- Linear — Creates a linear layer.
- Conv — Creates a convolution layer.
- ConvTransposed — Creates a transposed convolution layer.
Descriptions#
- class inox.nn.linear.Linear(in_features, out_features, bias=True, key=None)#
Creates a linear layer.
\[y = W x + b\]
- Parameters:
in_features (int) – The number of input features \(C\).
out_features (int) – The number of output features \(C'\).
bias (bool) – Whether the layer learns an additive bias \(b\) or not.
key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.
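The affine map above can be sketched in plain JAX. This is an illustrative sketch of what Linear computes, not the inox implementation itself; the initialization scheme shown is an assumption.

```python
import jax
import jax.numpy as jnp

in_features, out_features = 4, 3
key = jax.random.PRNGKey(0)
k_w, k_b = jax.random.split(key)

# LeCun-style scaling is a common choice for W; the exact scheme used by
# inox.nn.linear.Linear is an assumption here.
W = jax.random.normal(k_w, (out_features, in_features)) / jnp.sqrt(in_features)
b = jnp.zeros(out_features)

def linear(x):
    # y = W x + b
    return W @ x + b

x = jnp.ones(in_features)
y = linear(x)
assert y.shape == (out_features,)
```

With bias=False, the additive term \(b\) is simply dropped and the map reduces to \(y = W x\).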
- class inox.nn.linear.Conv(in_channels, out_channels, kernel_size, bias=True, stride=1, dilation=1, padding=0, groups=1, key=None)#
Creates a convolution layer.
\[y = W * x + b\]
References
A guide to convolution arithmetic for deep learning (Dumoulin et al., 2016)
- Parameters:
in_channels (int) – The number of input channels \(C\).
out_channels (int) – The number of output channels \(C'\).
kernel_size (Sequence[int]) – The size of the kernel \(W\) in each spatial axis.
bias (bool) – Whether the layer learns an additive bias \(b\) or not.
stride (int | Sequence[int]) – The stride coefficient in each spatial axis.
dilation (int | Sequence[int]) – The dilation coefficient in each spatial axis.
padding (int | Sequence[Tuple[int, int]]) – The padding applied to each end of each spatial axis.
groups (int) – The number of channel groups \(G\). Both \(C\) and \(C'\) must be divisible by \(G\).
key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.
- __call__(x)#
- Parameters:
x (Array) – The input tensor \(x\), with shape \((*, H_1, \dots, H_n, C)\).
- Returns:
The output tensor \(y\), with shape \((*, H_1', \dots, H_n', C')\), such that
\[H_i' = \left\lfloor \frac{H_i - d_i \times (k_i - 1) + p_i - 1}{s_i} \right\rfloor + 1\]
where \(k_i\), \(s_i\), \(d_i\) and \(p_i\) are respectively the kernel size, the stride coefficient, the dilation coefficient and the total padding of the \(i\)-th spatial axis.
- Return type:
Array
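The output-size formula above can be checked with a small helper. This is a hedged sketch for a single spatial axis; the function name and interface are illustrative, not part of the inox API.

```python
import math

def conv_out_size(h, k, s=1, d=1, p=0):
    """Output length of a convolution along one spatial axis.

    h: input length, k: kernel size, s: stride coefficient,
    d: dilation coefficient, p: total padding (both ends summed).
    """
    return math.floor((h - d * (k - 1) + p - 1) / s) + 1

# A 3-wide kernel with stride 1 and no padding shrinks the axis by 2.
assert conv_out_size(8, 3) == 6
# "Same" padding (p = k - 1) preserves the length at stride 1.
assert conv_out_size(8, 3, p=2) == 8
# Stride 2 roughly halves the axis: kernel positions 0, 2, 4.
assert conv_out_size(8, 3, s=2) == 3
```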
- class inox.nn.linear.ConvTransposed(in_channels, out_channels, kernel_size, bias=True, stride=1, dilation=1, padding=0, groups=1, key=None)#
Creates a transposed convolution layer.
This layer can be seen as the gradient of Conv with respect to its input. It is also known as a “deconvolution”, although it does not actually compute the inverse of a convolution.
References
A guide to convolution arithmetic for deep learning (Dumoulin et al., 2016)
- Parameters:
in_channels (int) – The number of input channels \(C\).
out_channels (int) – The number of output channels \(C'\).
kernel_size (Sequence[int]) – The size of the kernel \(W\) in each spatial axis.
bias (bool) – Whether the layer learns an additive bias \(b\) or not.
stride (int | Sequence[int]) – The stride coefficient in each spatial axis.
dilation (int | Sequence[int]) – The dilation coefficient in each spatial axis.
padding (int | Sequence[Tuple[int, int]]) – The padding applied to each end of each spatial axis.
groups (int) – The number of channel groups \(G\). Both \(C\) and \(C'\) must be divisible by \(G\).
key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.
- __call__(x)#
- Parameters:
x (Array) – The input tensor \(x\), with shape \((*, H_1, \dots, H_n, C)\).
- Returns:
The output tensor \(y\), with shape \((*, H_1', \dots, H_n', C')\), such that
\[H_i' = (H_i - 1) \times s_i + d_i \times (k_i - 1) - p_i + 1\]
where \(k_i\), \(s_i\), \(d_i\) and \(p_i\) are respectively the kernel size, the stride coefficient, the dilation coefficient and the total padding of the \(i\)-th spatial axis.
- Return type:
Array
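The transposed formula undoes the spatial shrinkage of the convolution it mirrors. A hedged sketch, again per axis and with illustrative helper names (not inox API):

```python
import math

def conv_out_size(h, k, s=1, d=1, p=0):
    # Convolution output length along one axis (see Conv.__call__).
    return math.floor((h - d * (k - 1) + p - 1) / s) + 1

def conv_transposed_out_size(h, k, s=1, d=1, p=0):
    # Transposed-convolution output length along one axis.
    return (h - 1) * s + d * (k - 1) - p + 1

# With matching hyperparameters and stride 1, the transposed convolution
# restores the spatial size consumed by the convolution.
h = 10
for k in (1, 3, 5):
    assert conv_transposed_out_size(conv_out_size(h, k), k) == h
```

For strides greater than 1 the convolution's floor discards information about the exact input length, so the round trip recovers it only up to the stride remainder.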