inox.nn.linear#

Linear layers

Classes#

Linear

Creates a linear layer.

Conv

Creates a convolution layer.

ConvTransposed

Creates a transposed convolution layer.

Descriptions#

class inox.nn.linear.Linear(in_features, out_features, bias=True, key=None)#

Creates a linear layer.

\[y = W x + b\]
Parameters:
  • in_features (int) – The number of input features \(C\).

  • out_features (int) – The number of output features \(C'\).

  • bias (bool) – Whether the layer learns an additive bias \(b\) or not.

  • key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.

__call__(x)#
Parameters:

x (Array) – The input vector \(x\), with shape \((*, C)\).

Returns:

The output vector \(y\), with shape \((*, C')\).

Return type:

Array
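
Example

A minimal usage sketch, not part of the reference documentation: it assumes the class is importable from inox.nn.linear as documented above and that key is a standard JAX PRNG key. Shapes follow the __call__ description.

>>> import jax
>>> import jax.numpy as jnp
>>> from inox.nn.linear import Linear
>>> layer = Linear(in_features=3, out_features=5, key=jax.random.PRNGKey(0))
>>> x = jnp.ones(3)   # input vector with C = 3 features
>>> y = layer(x)      # output vector y = W x + b, with C' = 5 features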

class inox.nn.linear.Conv(in_channels, out_channels, kernel_size, bias=True, stride=1, dilation=1, padding=0, groups=1, key=None)#

Creates a convolution layer.

\[y = W * x + b\]

References

A guide to convolution arithmetic for deep learning (Dumoulin et al., 2016)
Parameters:
  • in_channels (int) – The number of input channels \(C\).

  • out_channels (int) – The number of output channels \(C'\).

  • kernel_size (Sequence[int]) – The size of the kernel \(W\) in each spatial axis.

  • bias (bool) – Whether the layer learns an additive bias \(b\) or not.

  • stride (int | Sequence[int]) – The stride coefficient in each spatial axis.

  • dilation (int | Sequence[int]) – The dilation coefficient in each spatial axis.

  • padding (int | Sequence[Tuple[int, int]]) – The padding applied to each end of each spatial axis.

  • groups (int) – The number of channel groups \(G\). Both \(C\) and \(C'\) must be divisible by \(G\).

  • key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.

__call__(x)#
Parameters:

x (Array) – The input tensor \(x\), with shape \((*, H_1, \dots, H_n, C)\).

Returns:

The output tensor \(y\), with shape \((*, H_1', \dots, H_n', C')\), such that

\[H_i' = \left\lfloor \frac{H_i - d_i \times (k_i - 1) + p_i - 1}{s_i} \right\rfloor + 1\]

where \(k_i\), \(s_i\), \(d_i\) and \(p_i\) are respectively the kernel size, the stride coefficient, the dilation coefficient and the total padding of the \(i\)-th spatial axis.

Return type:

Array
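
Example

An illustrative sketch, not taken from the reference documentation: it assumes channels-last inputs as documented above and a standard JAX PRNG key. The shape comment applies the output-size formula with the default stride, dilation and padding.

>>> import jax
>>> import jax.numpy as jnp
>>> from inox.nn.linear import Conv
>>> layer = Conv(in_channels=3, out_channels=16, kernel_size=(3, 3), key=jax.random.PRNGKey(0))
>>> x = jnp.ones((64, 64, 3))  # (H_1, H_2, C)
>>> y = layer(x)               # (62, 62, 16): with k_i = 3, s_i = 1, d_i = 1, p_i = 0, H_i' = 64 - 2 = 62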

class inox.nn.linear.ConvTransposed(in_channels, out_channels, kernel_size, bias=True, stride=1, dilation=1, padding=0, groups=1, key=None)#

Creates a transposed convolution layer.

This layer can be seen as the gradient of Conv with respect to its input. It is also known as a “deconvolution”, although it does not actually compute the inverse of a convolution.

References

A guide to convolution arithmetic for deep learning (Dumoulin et al., 2016)
Parameters:
  • in_channels (int) – The number of input channels \(C\).

  • out_channels (int) – The number of output channels \(C'\).

  • kernel_size (Sequence[int]) – The size of the kernel \(W\) in each spatial axis.

  • bias (bool) – Whether the layer learns an additive bias \(b\) or not.

  • stride (int | Sequence[int]) – The stride coefficient in each spatial axis.

  • dilation (int | Sequence[int]) – The dilation coefficient in each spatial axis.

  • padding (int | Sequence[Tuple[int, int]]) – The padding applied to each end of each spatial axis.

  • groups (int) – The number of channel groups \(G\). Both \(C\) and \(C'\) must be divisible by \(G\).

  • key (Array) – A PRNG key for initialization. If None, inox.random.get_rng is used instead.

__call__(x)#
Parameters:

x (Array) – The input tensor \(x\), with shape \((*, H_1, \dots, H_n, C)\).

Returns:

The output tensor \(y\), with shape \((*, H_1', \dots, H_n', C')\), such that

\[H_i' = (H_i - 1) \times s_i + d_i \times (k_i - 1) - p_i + 1\]

where \(k_i\), \(s_i\), \(d_i\) and \(p_i\) are respectively the kernel size, the stride coefficient, the dilation coefficient and the total padding of the \(i\)-th spatial axis.

Return type:

Array
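
Example

An illustrative sketch, not taken from the reference documentation: with hyperparameters matching the Conv example above, the transposed convolution restores the spatial shape that the convolution reduced. The shape comment applies the output-size formula with the default stride, dilation and padding.

>>> import jax
>>> import jax.numpy as jnp
>>> from inox.nn.linear import ConvTransposed
>>> layer = ConvTransposed(in_channels=16, out_channels=3, kernel_size=(3, 3), key=jax.random.PRNGKey(0))
>>> x = jnp.ones((62, 62, 16))  # (H_1, H_2, C)
>>> y = layer(x)                # (64, 64, 3): H_i' = (62 - 1) * 1 + 1 * (3 - 1) - 0 + 1 = 64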