Normalization Vs Standardization
Normalization typically means rescaling values into the range [0, 1].
Standardization typically means rescaling data to have a mean of 0 and a standard deviation of 1 (unit variance).
MIN-MAX NORMALIZATION
Min-max normalization is one of the most common ways to normalize data. For every feature, the minimum value of that feature gets transformed into a 0, the maximum value gets transformed into a 1, and every other value gets transformed into a decimal between 0 and 1.
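Min-max normalization has one fairly significant downside: it does not handle outliers very well. A single extreme value becomes the 0 or 1 endpoint, and all the other values get squeezed into a small part of the range.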
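As a rough sketch of the transformation described above, assuming the usual formula x' = (x - min) / (max - min). The helper name `min_max_normalize` and the sample values are made up for illustration. The second list shows the outlier problem: one very large value pushes everything else toward 0.

```python
def min_max_normalize(values):
    """Rescale a list of numbers into [0, 1] (hypothetical helper for this sketch)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so the range is zero.
        # Mapping everything to 0.0 is a convention chosen for this sketch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Evenly spread values use the whole [0, 1] range.
ages = [20, 30, 40, 50, 60]
scaled = min_max_normalize(ages)

# Same data with one outlier: 1000 maps to 1, the rest end up close to 0.
with_outlier = [20, 30, 40, 50, 1000]
scaled_outlier = min_max_normalize(with_outlier)

print(scaled)
print(scaled_outlier)
```

In the second list, every value except the outlier lands below 0.05, so most of the [0, 1] range goes unused.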
Z-SCORE NORMALIZATION/STANDARDIZATION
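Z-score normalization is a strategy of normalizing data that avoids this outlier issue. Each value is replaced by its distance from the feature's mean, measured in standard deviations: z = (x - mean) / standard deviation.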
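A minimal sketch of z-score standardization, assuming the formula z = (x - mean) / standard deviation and using only Python's standard library. The helper name `z_score_standardize` and the sample data are illustrative. The population standard deviation is used here, which is an assumption of this sketch; some tools use the sample standard deviation instead.

```python
import statistics


def z_score_standardize(values):
    """Rescale numbers to mean 0 and standard deviation 1 (hypothetical helper)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        # All values are identical; return zeros rather than dividing by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


data = [20, 30, 40, 50, 1000]
z = z_score_standardize(data)

print(z)
print(statistics.fmean(z), statistics.pstdev(z))
```

The result has a mean of roughly 0 and a standard deviation of roughly 1, but the values are not confined to a fixed interval such as [0, 1].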
Min-max normalization: Guarantees all features will have the exact same scale but does not handle outliers well.
Z-score normalization: Handles outliers, but does not produce normalized data with the exact same scale.
- Normalization is good to use when you know that the distribution of your data does not follow a Gaussian distribution. This can be useful in algorithms that do not assume any distribution of the data like K-Nearest Neighbors and Neural Networks.
- Standardization, on the other hand, can be helpful in cases where the data follows a Gaussian distribution, although this is not a strict requirement. Also, unlike normalization, standardization does not have a bounding range. So, even if you have outliers in your data, they are not forced to an endpoint of a fixed range and do not squeeze the remaining values into a narrow interval.
