This content originally appeared on DEV Community and was authored by Chanchal Singh
Welcome back to the Statistics Challenge for Data Scientists!
Today, we’re learning something that makes our data fair — Normalization.
What is Normalization?
Imagine you and your friend are running a race.
- You run 100 meters
- Your friend runs 1 kilometer (1000 meters)
Can we directly compare who runs faster?
Not really — because the units and scales are different.
That’s exactly what happens with data — some numbers are small (like age), and some are huge (like salary).
Normalization means scaling data so that all values fit into a similar range and can be compared fairly.
Why Do We Need Normalization?
Think of a teacher giving marks to students:
- Math score: 100 marks
- Science score: 50 marks
If we add them directly, Math will dominate because its maximum is higher.
To treat both subjects fairly, we scale the marks — that’s normalization.
In data science, normalization helps machine learning models:
- Work faster
- Learn better
- Give fair importance to each feature
Two Popular Normalization Methods
Let’s understand the two most common types — Min-Max Normalization and Z-Score Normalization.
1. Min-Max Normalization (Feature Scaling)
It squeezes all data values between 0 and 1.
Formula:
X' = (X - Xmin) / (Xmax - Xmin)
Example:
Let’s say we have ages: 10, 20, 30, 40, 50.
- Minimum = 10
- Maximum = 50
For age = 30
X' = (30 - 10) / (50 - 10) = 20 / 40 = 0.5
So, the normalized value is 0.5.
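The worked example above can be sketched in plain Python. The function name `min_max_normalize` is just an illustration, not a library API:

```python
def min_max_normalize(values):
    """Scale values to the 0-1 range using X' = (X - Xmin) / (Xmax - Xmin)."""
    x_min, x_max = min(values), max(values)
    return [(x - x_min) / (x_max - x_min) for x in values]

ages = [10, 20, 30, 40, 50]
print(min_max_normalize(ages))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

As expected, age 30 maps to 0.5, and the minimum and maximum land exactly on 0 and 1.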
When to Use:
- When your data has a fixed range (like 0 to 100 marks).
- Best for distance-based algorithms (like KNN and K-Means) and for gradient-trained models such as neural networks, which converge faster on scaled inputs.
2. Z-Score Normalization (Standardization)
This method centers the data around mean = 0 and standard deviation = 1.
It shows how far each value is from the average.
Formula:
Z = (X - μ) / σ
Where:
- μ = Mean of the data
- σ = Standard deviation (the population standard deviation is used in the example below)
Example:
Let’s say heights (in cm): 150, 160, 170, 180, 190
- Mean (μ) = 170
- Standard deviation (σ) = 14.14
For height = 150
Z = (150 - 170) / 14.14 = -1.41
So, 150 cm is 1.41 standard deviations below the mean.
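The height example can be reproduced with a few lines of plain Python. This is a minimal sketch using the population standard deviation, matching the σ = 14.14 above; `z_score_normalize` is an illustrative name, not a library function:

```python
import math

def z_score_normalize(values):
    """Standardize values using Z = (X - mean) / population std deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    return [(x - mean) / std for x in values]

heights = [150, 160, 170, 180, 190]
print([round(z, 2) for z in z_score_normalize(heights)])
# [-1.41, -0.71, 0.0, 0.71, 1.41]
```

The value 150 comes out at -1.41, i.e. 1.41 standard deviations below the mean, just as computed by hand.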
When to Use:
- When data doesn’t have a fixed range.
- Works well with algorithms that assume roughly normally distributed or zero-centered features (like Linear Regression, Logistic Regression, and PCA).
Min-Max vs Z-Score — Quick Comparison
| Feature | Min-Max Normalization | Z-Score Normalization |
|---|---|---|
| Range | 0 to 1 | Can be negative or positive |
| Depends on | Min & Max values | Mean & Standard Deviation |
| Sensitive to outliers | Yes | Less sensitive |
| Best for | Bounded data (e.g. exam scores) | Unbounded data (e.g. height, salary) |
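The outlier-sensitivity row in the table is easy to demonstrate. In this illustrative sketch, one extreme value (1000 in place of 50) is added to the age data; helper names and the dataset are my own, chosen for the demo:

```python
import math

def min_max(values):
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

def z_score(values):
    mean = sum(values) / len(values)
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    return [(x - mean) / std for x in values]

data = [10, 20, 30, 40, 1000]  # 1000 is an outlier

# Min-Max: the four ordinary values get squashed below 0.05,
# because the single outlier defines the whole 0-1 range.
print([round(v, 3) for v in min_max(data)])

# Z-Score: the ordinary values keep usable spread around the mean.
print([round(v, 3) for v in z_score(data)])
```

One extreme value drags the min-max range so far that the rest of the data becomes nearly indistinguishable, while z-scores degrade more gracefully.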
Summary
- Normalization makes data fair by bringing all features to a similar scale.
- Use Min-Max when data has clear limits (like percentages).
- Use Z-Score when data spreads freely and you care about distance from average.
Quick Recap Example
| Original Value | Min-Max (0-1) | Z-Score |
|---|---|---|
| 10 | 0.0 | -1.41 |
| 30 | 0.5 | 0.0 |
| 50 | 1.0 | +1.41 |
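The recap table can be verified directly, assuming it is computed over the same ages dataset (10, 20, 30, 40, 50) from the Min-Max example, with the population standard deviation:

```python
import math

values = [10, 20, 30, 40, 50]  # same dataset as the age example
x_min, x_max = min(values), max(values)
mean = sum(values) / len(values)
std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))

for x in (10, 30, 50):
    mm = (x - x_min) / (x_max - x_min)
    z = (x - mean) / std
    print(f"{x}: min-max={mm:.1f}, z-score={z:+.2f}")
```

Running this reproduces the three table rows: 0.0/-1.41, 0.5/0.00, and 1.0/+1.41.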
In short:
Normalization is like giving everyone the same playing field so that your machine learning model doesn’t play favorites!
I love breaking down complex topics into simple, easy-to-understand explanations so everyone can follow along. If you're into learning AI in a beginner-friendly way, make sure to follow for more!
Connect on Linkedin: https://www.linkedin.com/in/chanchalsingh22/
Connect on YouTube: https://www.youtube.com/@Brains_Behind_Bots
Chanchal Singh | Sciencx (2025-11-15T09:30:00+00:00) Statistics Day 4: Z-Score vs Min-Max Normalization — Making Data Fair for ML Models. Retrieved from https://www.scien.cx/2025/11/15/statistics-day-4-z-score-vs-min-max-normalization-making-data-fair-for-ml-models/