inox.nn.linear#

Linear layers

Classes#

Linear

Creates a linear layer.

Conv

Creates a convolution layer.

ConvTransposed

Creates a transposed convolution layer.

Descriptions#

class inox.nn.linear.Linear(in_features, out_features, bias=True, key=None)#

Creates a linear layer.

\[y = W x + b\]
Parameters:
  • in_features (int) – The number of input features \(C\).

  • out_features (int) – The number of output features \(C'\).

  • bias (bool) – Whether the layer learns an additive bias \(b\) or not.

  • key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.

__call__(x)#
Parameters:

x (Array) – The input vector \(x\), with shape \((*, C)\).

Returns:

The output vector \(y\), with shape \((*, C')\).

Return type:

Array
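
Example

A minimal usage sketch, not part of the reference documentation: it assumes the class is importable from inox.nn.linear as documented above and that key is a standard JAX PRNG key. Shapes follow the __call__ description.

>>> import jax
>>> import jax.numpy as jnp
>>> from inox.nn.linear import Linear
>>> layer = Linear(in_features=3, out_features=5, key=jax.random.PRNGKey(0))
>>> x = jnp.ones(3)   # input vector with C = 3 features
>>> y = layer(x)      # output vector y = W x + b, with C' = 5 features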

class inox.nn.linear.Conv(in_channels, out_channels, kernel_size, bias=True, stride=1, dilation=1, padding=0, groups=1, key=None)#

Creates a convolution layer.

\[y = W * x + b\]

References

A guide to convolution arithmetic for deep learning (Dumoulin et al., 2016)
Parameters:
  • in_channels (int) – The number of input channels \(C\).

  • out_channels (int) – The number of output channels \(C'\).

  • kernel_size (Sequence[int]) – The size of the kernel \(W\) in each spatial axis.

  • bias (bool) – Whether the layer learns an additive bias \(b\) or not.

  • stride (int | Sequence[int]) – The stride coefficient in each spatial axis.

  • dilation (int | Sequence[int]) – The dilation coefficient in each spatial axis.

  • padding (int | Sequence[Tuple[int, int]]) – The padding applied to each end of each spatial axis.

  • groups (int) – The number of channel groups \(G\). Both \(C\) and \(C'\) must be divisible by \(G\).

  • key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.

__call__(x)#
Parameters:

x (Array) – The input tensor \(x\), with shape \((*, H_1, \dots, H_n, C)\).

Returns:

The output tensor \(y\), with shape \((*, H_1', \dots, H_n', C')\), such that

\[H_i' = \left\lfloor \frac{H_i - d_i \times (k_i - 1) + p_i - 1}{s_i} \right\rfloor + 1\]

where \(k_i\), \(s_i\), \(d_i\) and \(p_i\) are respectively the kernel size, the stride coefficient, the dilation coefficient and the total padding of the \(i\)-th spatial axis.

Return type:

Array
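
Example

An illustrative sketch, not taken from the reference documentation: it assumes channels-last inputs as documented above and a standard JAX PRNG key. The shape comment applies the output-size formula with the default stride, dilation and padding.

>>> import jax
>>> import jax.numpy as jnp
>>> from inox.nn.linear import Conv
>>> layer = Conv(in_channels=3, out_channels=16, kernel_size=(3, 3), key=jax.random.PRNGKey(0))
>>> x = jnp.ones((64, 64, 3))  # (H_1, H_2, C)
>>> y = layer(x)               # (62, 62, 16): with k_i = 3, s_i = 1, d_i = 1, p_i = 0, H_i' = 64 - 2 = 62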

class inox.nn.linear.ConvTransposed(in_channels, out_channels, kernel_size, bias=True, stride=1, dilation=1, padding=0, groups=1, key=None)#

Creates a transposed convolution layer.

This layer can be seen as the gradient of Conv with respect to its input. It is also known as a “deconvolution”, although it does not actually compute the inverse of a convolution.

References

A guide to convolution arithmetic for deep learning (Dumoulin et al., 2016)
Parameters:
  • in_channels (int) – The number of input channels \(C\).

  • out_channels (int) – The number of output channels \(C'\).

  • kernel_size (Sequence[int]) – The size of the kernel \(W\) in each spatial axis.

  • bias (bool) – Whether the layer learns an additive bias \(b\) or not.

  • stride (int | Sequence[int]) – The stride coefficient in each spatial axis.

  • dilation (int | Sequence[int]) – The dilation coefficient in each spatial axis.

  • padding (int | Sequence[Tuple[int, int]]) – The padding applied to each end of each spatial axis.

  • groups (int) – The number of channel groups \(G\). Both \(C\) and \(C'\) must be divisible by \(G\).

  • key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.

__call__(x)#
Parameters:

x (Array) – The input tensor \(x\), with shape \((*, H_1, \dots, H_n, C)\).

Returns:

The output tensor \(y\), with shape \((*, H_1', \dots, H_n', C')\), such that

\[H_i' = (H_i - 1) \times s_i + d_i \times (k_i - 1) - p_i + 1\]

where \(k_i\), \(s_i\), \(d_i\) and \(p_i\) are respectively the kernel size, the stride coefficient, the dilation coefficient and the total padding of the \(i\)-th spatial axis.

Return type:

Array
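
Example

An illustrative sketch, not taken from the reference documentation: with hyperparameters matching the Conv example above, the transposed convolution restores the spatial shape that the convolution reduced. The shape comment applies the output-size formula with the default stride, dilation and padding.

>>> import jax
>>> import jax.numpy as jnp
>>> from inox.nn.linear import ConvTransposed
>>> layer = ConvTransposed(in_channels=16, out_channels=3, kernel_size=(3, 3), key=jax.random.PRNGKey(0))
>>> x = jnp.ones((62, 62, 16))  # (H_1, H_2, C)
>>> y = layer(x)                # (64, 64, 3): H_i' = (62 - 1) * 1 + 1 * (3 - 1) - 0 + 1 = 64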