inox.nn.normalization#

Normalization layers

Classes#

BatchNorm

Creates a batch-normalization layer.

LayerNorm

Creates a layer-normalization layer.

GroupNorm

Creates a group-normalization layer.

Descriptions#

class inox.nn.normalization.BatchNorm(channels, epsilon=1e-05, momentum=0.9)#

Creates a batch-normalization layer.

\[y = \frac{x - \mathbb{E}[x]}{\sqrt{\mathbb{V}[x] + \epsilon}}\]

The mean and variance are calculated over the batch and spatial axes. During training, the layer keeps running estimates of the mean and variance, which are then used for normalization during evaluation. The update rule for a running average statistic \(\hat{s}\) is

\[\hat{s} \gets \alpha \hat{s} + (1 - \alpha) s\]

where \(s\) is the statistic calculated for the current batch.
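The update rule above can be sketched in plain Python. This is illustrative only: inox maintains these running estimates internally, in the state dictionary passed to `__call__`.

```python
def update_running(s_hat, s, alpha=0.9):
    """Exponential moving average: s_hat <- alpha * s_hat + (1 - alpha) * s."""
    return alpha * s_hat + (1.0 - alpha) * s

# Starting from 0, repeated updates pull the estimate toward the batch statistic.
running_mean = 0.0
for batch_mean in [1.0, 1.0, 1.0]:
    running_mean = update_running(running_mean, batch_mean)
# running_mean is now 0.271 = 1 - 0.9**3
```

With the default momentum `alpha = 0.9`, the estimate retains 90% of its previous value at each step, so it converges geometrically toward the per-batch statistics.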

References

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Ioffe et al., 2015)
Parameters:
  • channels (int) – The number of channels \(C\).

  • epsilon (float | Array) – A numerical stability term \(\epsilon\).

  • momentum (float | Array) – The momentum \(\alpha \in [0, 1]\) for the running estimates.

__call__(x, state)#
Parameters:
  • x (Array) – The input tensor \(x\), with shape \((N, *, C)\).

  • state (Dict) – The state dictionary.

Returns:

The output tensor \(y\), with shape \((N, *, C)\), and the (updated) state dictionary.

Return type:

Tuple[Array, Dict]
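A minimal, stdlib-only sketch of the normalization itself, applied to a single channel of a toy batch (not the inox implementation, which operates on JAX arrays over the batch and spatial axes):

```python
import math

def batch_norm(xs, eps=1e-5):
    """Normalize a list of scalars (one channel) with batch statistics,
    following y = (x - E[x]) / sqrt(V[x] + eps)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n  # biased (population) variance
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

ys = batch_norm([1.0, 2.0, 3.0, 4.0])
# The output has (approximately) zero mean and unit variance.
```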

class inox.nn.normalization.LayerNorm(axis=-1, epsilon=1e-05)#

Creates a layer-normalization layer.

\[y = \frac{x - \mathbb{E}[x]}{\sqrt{\mathbb{V}[x] + \epsilon}}\]

References

Layer Normalization (Ba et al., 2016)
Parameters:
  • axis (int | Sequence[int]) – The axis(es) over which the mean and variance are calculated.

  • epsilon (float | Array) – A numerical stability term \(\epsilon\).

__call__(x)#
Parameters:

x (Array) – The input tensor \(x\), with shape \((*, C)\).

Returns:

The output tensor \(y\), with shape \((*, C)\).

Return type:

Array
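Unlike BatchNorm, layer normalization computes statistics per sample, over the feature axis(es). A stdlib-only sketch for a single axis (illustrative, not the inox implementation):

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one sample's feature vector over its last axis."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

# Each sample is normalized independently of the rest of the batch,
# so rescaling a sample does not change its normalized output.
batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
normalized = [layer_norm(row) for row in batch]
# Both rows map to (nearly) the same normalized values.
```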

class inox.nn.normalization.GroupNorm(groups, epsilon=1e-05)#

Creates a group-normalization layer.

References

Group Normalization (Wu and He, 2018)
Parameters:
  • groups (int) – The number of groups \(G\) to separate channels into. If \(G = 1\), the layer is equivalent to LayerNorm.

  • epsilon (float | Array) – A numerical stability term \(\epsilon\).

__call__(x)#
Parameters:

x (Array) – The input tensor \(x\), with shape \((*, C)\).

Returns:

The output tensor \(y\), with shape \((*, C)\).

Return type:

Array
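Group normalization splits the \(C\) channels into \(G\) contiguous groups and normalizes within each group. A stdlib-only sketch of the idea for a single sample (not the inox implementation):

```python
import math

def group_norm(x, groups, eps=1e-5):
    """Normalize a vector of C channels within each of G contiguous groups."""
    c = len(x)
    assert c % groups == 0, "channels must be divisible by groups"
    size = c // groups
    out = []
    for g in range(groups):
        group = x[g * size:(g + 1) * size]
        mean = sum(group) / size
        var = sum((v - mean) ** 2 for v in group) / size
        out.extend((v - mean) / math.sqrt(var + eps) for v in group)
    return out

# With groups=1, all channels share one set of statistics,
# which matches LayerNorm over the channel axis.
x = [1.0, 2.0, 3.0, 4.0]
```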