Applies RMS (Root Mean Square) Normalization over a mini-batch of inputs, as described in the paper Root Mean Square Layer Normalization.
The mean is calculated over the trailing dimensions starting at axis. For example, if axis = -2, the mean is computed over the last 2 dimensions of the input.
SkipRMSNorm is performed by the formula below:
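Assuming the standard RMSNorm definition from the cited paper, with the skip input added before normalization, the computation can be written as follows. The symbols Y (output features), W (transformation weight), and \epsilon (a small constant for numerical stability) are notational assumptions rather than names confirmed by this page, and SkipOut = X + SkipIn is inferred from the output description.

```latex
\mathrm{SkipOut} = X + \mathrm{SkipIn}
\qquad
Y = \frac{\mathrm{SkipOut}}{\sqrt{\operatorname{Mean}\left(\mathrm{SkipOut}^{2}\right) + \epsilon}} \cdot W
```

Here the mean is taken over the normalization dimensions, and SkipOut reduces to X when SkipIn is absent.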
Axis at which the normalization dimensions begin.
Whether to apply SkipRMSNorm.
Input features.
Shape:
Transformation weight.
Shape:
Skip input.
Shape: same as X
Output features.
Shape: same as X
Skip output. If SkipIn is not provided, SkipOut is a copy of X.
Shape: same as X
If the input is float16, the data is converted to float32 before RMSNorm is applied.
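The behavior described above can be sketched in NumPy as a reference. This is a minimal illustration and not the operator's actual implementation: the function name `skip_rms_norm`, the parameter names, the `eps` default of 1e-5, the assumption that the weight broadcasts against the normalized dimensions, and the choice to compute SkipOut as X + SkipIn in the input dtype are all assumptions made for the example.

```python
import numpy as np

def skip_rms_norm(x, weight, skip_in=None, axis=-1, eps=1e-5):
    """Hypothetical reference sketch of SkipRMSNorm; returns (y, skip_out)."""
    # Assumed: SkipOut is X + SkipIn when SkipIn is given, otherwise a copy of X.
    skip_out = x + skip_in if skip_in is not None else x.copy()

    # float16 data is converted to float32 before normalization.
    h = skip_out.astype(np.float32) if skip_out.dtype == np.float16 else skip_out

    # Reduce over every dimension from `axis` to the last one.
    start = axis % h.ndim
    reduce_axes = tuple(range(start, h.ndim))

    # Mean of squares over the normalization dimensions, kept for broadcasting.
    mean_sq = np.mean(np.square(h), axis=reduce_axes, keepdims=True)

    # Normalize, scale by the weight, and cast back to the input dtype.
    y = h / np.sqrt(mean_sq + eps) * weight
    return y.astype(x.dtype), skip_out


# Usage: normalize the last 2 dimensions of a (2, 3, 4) input.
x = np.random.rand(2, 3, 4).astype(np.float16)
skip = np.random.rand(2, 3, 4).astype(np.float16)
w = np.ones((3, 4), dtype=np.float16)
y, skip_out = skip_rms_norm(x, w, skip_in=skip, axis=-2)
print(y.shape, y.dtype, skip_out.shape)
```

Both outputs have the same shape as X, matching the shape notes above; the cast back to the input dtype at the end is a design choice of this sketch.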