inox.nn.normalization#

Normalization layers

Classes#

BatchNorm

Creates a batch-normalization layer.

LayerNorm

Creates a layer-normalization layer.

GroupNorm

Creates a group-normalization layer.

Descriptions#

class inox.nn.normalization.BatchNorm(channels, epsilon=1e-05, momentum=0.9)#

Creates a batch-normalization layer.

\[y = \frac{x - \mathbb{E}[x]}{\sqrt{\mathbb{V}[x] + \epsilon}}\]

The mean and variance are calculated over the batch and spatial axes. During training, the layer keeps running estimates of the mean and variance, which are then used for normalization during evaluation. The update rule for a running average statistic \(\hat{s}\) is

\[\hat{s} \gets \alpha \hat{s} + (1 - \alpha) s\]

where \(s\) is the statistic calculated for the current batch.
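The update rule above can be sketched in plain Python. This is illustrative only: inox maintains these running estimates internally, in the state dictionary passed to `__call__`.

```python
def update_running(s_hat, s, alpha=0.9):
    """Exponential moving average: s_hat <- alpha * s_hat + (1 - alpha) * s."""
    return alpha * s_hat + (1.0 - alpha) * s

# Starting from 0, repeated updates pull the estimate toward the batch statistic.
running_mean = 0.0
for batch_mean in [1.0, 1.0, 1.0]:
    running_mean = update_running(running_mean, batch_mean)
# running_mean is now 0.271 = 1 - 0.9**3
```

With the default momentum `alpha = 0.9`, the estimate retains 90% of its previous value at each step, so it converges geometrically toward the per-batch statistics.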

References

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Ioffe et al., 2015)
Parameters:
  • channels (int) – The number of channels \(C\).

  • epsilon (float | Array) – A numerical stability term \(\epsilon\).

  • momentum (float | Array) – The momentum \(\alpha \in [0, 1]\) for the running estimates.

__call__(x, state)#
Parameters:
  • x (Array) – The input tensor \(x\), with shape \((N, *, C)\).

  • state (Dict) – The state dictionary.

Returns:

The output tensor \(y\), with shape \((N, *, C)\), and the (updated) state dictionary.

Return type:

Tuple[Array, Dict]
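A minimal, stdlib-only sketch of the normalization itself, applied to a single channel of a toy batch (not the inox implementation, which operates on JAX arrays over the batch and spatial axes):

```python
import math

def batch_norm(xs, eps=1e-5):
    """Normalize a list of scalars (one channel) with batch statistics,
    following y = (x - E[x]) / sqrt(V[x] + eps)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n  # biased (population) variance
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

ys = batch_norm([1.0, 2.0, 3.0, 4.0])
# The output has (approximately) zero mean and unit variance.
```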

class inox.nn.normalization.LayerNorm(axis=-1, epsilon=1e-05)#

Creates a layer-normalization layer.

\[y = \frac{x - \mathbb{E}[x]}{\sqrt{\mathbb{V}[x] + \epsilon}}\]

References

Layer Normalization (Ba et al., 2016)
Parameters:
  • axis (int | Sequence[int]) – The axis(es) over which the mean and variance are calculated.

  • epsilon (float | Array) – A numerical stability term \(\epsilon\).

__call__(x)#
Parameters:

x (Array) – The input tensor \(x\), with shape \((*, C)\).

Returns:

The output tensor \(y\), with shape \((*, C)\).

Return type:

Array
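Unlike BatchNorm, layer normalization computes statistics per sample, over the feature axis(es). A stdlib-only sketch for a single axis (illustrative, not the inox implementation):

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one sample's feature vector over its last axis."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

# Each sample is normalized independently of the rest of the batch,
# so rescaling a sample does not change its normalized output.
batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
normalized = [layer_norm(row) for row in batch]
# Both rows map to (nearly) the same normalized values.
```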

class inox.nn.normalization.GroupNorm(groups, epsilon=1e-05)#

Creates a group-normalization layer.

References

Group Normalization (Wu and He, 2018)
Parameters:
  • groups (int) – The number of groups \(G\) to separate channels into. If \(G = 1\), the layer is equivalent to LayerNorm.

  • epsilon (float | Array) – A numerical stability term \(\epsilon\).

__call__(x)#
Parameters:

x (Array) – The input tensor \(x\), with shape \((*, C)\).

Returns:

The output tensor \(y\), with shape \((*, C)\).

Return type:

Array
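Group normalization splits the \(C\) channels into \(G\) contiguous groups and normalizes within each group. A stdlib-only sketch of the idea for a single sample (not the inox implementation):

```python
import math

def group_norm(x, groups, eps=1e-5):
    """Normalize a vector of C channels within each of G contiguous groups."""
    c = len(x)
    assert c % groups == 0, "channels must be divisible by groups"
    size = c // groups
    out = []
    for g in range(groups):
        group = x[g * size:(g + 1) * size]
        mean = sum(group) / size
        var = sum((v - mean) ** 2 for v in group) / size
        out.extend((v - mean) / math.sqrt(var + eps) for v in group)
    return out

# With groups=1, all channels share one set of statistics,
# which matches LayerNorm over the channel axis.
x = [1.0, 2.0, 3.0, 4.0]
```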