Measure of Central Tendency
Author: Ernad Mujakic
Date: 2025-07-06
A measure of central tendency is a statistical measure that attempts to describe the center of a dataset. The measure attempts to summarize the dataset with a single value that represents the middle or "average" of the data. The most common measures are the Mean, Median, and Mode. Depending on the characteristics of the underlying dataset, one measure may be more appropriate than the others.
The mean, often referred to as the average, is one of the most widely used measures of central tendency. There are various types of means, with the arithmetic mean being the most common. The arithmetic mean is calculated by summing all values in a dataset and dividing by the number of values.
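The sample mean, denoted as $\bar{x}$, is calculated as:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Where $n$ is the number of values in the sample and $x_i$ is the $i$-th value.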
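As a minimal sketch of the arithmetic mean described above, the following Python snippet sums the values and divides by their count. The function name `sample_mean` and the `scores` data are hypothetical, chosen only for illustration; the standard library's `statistics.mean` is shown alongside as a cross-check.

```python
from statistics import mean


def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    n = len(values)
    if n == 0:
        raise ValueError("sample_mean requires at least one value")
    return sum(values) / n


scores = [2, 4, 4, 5, 10]    # hypothetical sample
print(sample_mean(scores))   # 25 / 5 = 5.0
print(mean(scores))          # standard-library equivalent
```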
Properties
- The mean is sensitive to outliers, so it may not represent the center well when the underlying dataset is skewed or contains extreme values.
- For symmetric distributions, the mean coincides with the median and is a reliable measure of the center.
- The mean is the value that minimizes the sum of squared deviations from the data, which is why it appears in least-squares methods such as linear regression.
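The sensitivity to outliers noted in the properties above can be illustrated with a small, made-up example. The `incomes` values below are hypothetical; the sketch only shows how a single extreme value pulls the mean away from the bulk of the data.

```python
from statistics import mean

incomes = [30, 32, 35, 38, 40]            # hypothetical values, in thousands
incomes_with_outlier = incomes + [500]    # the same data plus one extreme value

mean_before = mean(incomes)               # 175 / 5 = 35
mean_after = mean(incomes_with_outlier)   # 675 / 6 = 112.5

print(mean_before, mean_after)
```

After the outlier is added, the mean is larger than every value except the outlier itself, so it no longer describes a typical observation.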
The median is the middle value of a dataset when it is ordered. If there is an even number of values, the mean (average) of the two middle values is taken.
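If the number of values $n$ is odd, the median is the middle value of the ordered values $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$:
$$\text{Median} = x_{\left(\frac{n+1}{2}\right)}$$
While if $n$ is even, the median is the mean of the two middle values:
$$\text{Median} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2}$$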
Properties
- The median is less susceptible to outliers than the mean, so it provides a more representative measure of the center for skewed distributions.
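- The median requires values that can be ordered, so it does not apply to nominal (unordered categorical) variables. In data science and machine learning, the mode is used for those instead, for example to determine the most common class label in a classification task.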
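To illustrate why the median is preferred for skewed data, the sketch below compares both measures on a small right-skewed sample. The `response_times_ms` values are hypothetical.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: most values are small, one is very large.
response_times_ms = [12, 14, 15, 15, 16, 18, 250]

center_mean = mean(response_times_ms)      # 340 / 7, roughly 48.6
center_median = median(response_times_ms)  # middle of the sorted values: 15

print(center_mean, center_median)
```

Six of the seven values lie below the mean, while the median sits in the middle of the typical values.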
The mode of a dataset is the value that appears most frequently. A dataset can have:
- One mode (unimodal)
- Two modes (bimodal)
- Multiple modes (multimodal)
The mode is useful for analyzing Nominal Data, as it helps identify the most "popular" category within a given set of values.
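The mode can be defined as:
$$\text{Mode} = \operatorname*{arg\,max}_{x} f(x)$$
Where $f(x)$ is the number of times the value $x$ appears in the dataset.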
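A minimal sketch of finding the mode by counting occurrences is shown below. The function name `modes` is hypothetical; it returns every value tied for the highest frequency, so unimodal and bimodal datasets are both handled. The standard library's `statistics.multimode` (Python 3.8+) offers similar behavior.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the highest frequency."""
    counts = Counter(values)          # maps each value to its frequency
    if not counts:
        return []
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3]))             # one mode: [2]
print(modes([1, 1, 2, 2, 3]))          # two modes: [1, 2]
print(modes(["cat", "dog", "cat"]))    # works on categories: ['cat']
```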
Properties
- Unlike the mean or median, the mode can be applied directly to Categorical Data, making it one of the simplest and most versatile measures of central tendency.
Summary of When to Use Each Measure
Type of Variable | Best Measure of Central Tendency |
---|---|
Nominal Data | Mode |
Ordinal Data | Median |
Interval/Ratio (not skewed) | Mean |
Interval/Ratio (skewed) | Median |
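The table above can be expressed as a small helper that picks a measure from the type of variable. This is only an illustrative sketch: the function name `central_tendency`, the `level` strings, and the `skewed` flag are assumptions, and deciding whether data is skewed is left to the caller. For ordinal data the sketch assumes the categories are encoded as comparable ranks and uses `statistics.median_low` so that the result is always an actual category.

```python
from statistics import mean, median, median_low, multimode


def central_tendency(values, level, skewed=False):
    """Pick a measure of central tendency based on the type of variable."""
    if level == "nominal":
        return multimode(values)        # list of the most frequent value(s)
    if level == "ordinal":
        return median_low(values)       # middle rank, always a member of the data
    if level in ("interval", "ratio"):
        return median(values) if skewed else mean(values)
    raise ValueError(f"unknown level of measurement: {level!r}")


print(central_tendency(["red", "blue", "red"], "nominal"))
print(central_tendency([1, 2, 2, 3], "ordinal"))
print(central_tendency([2, 4, 6], "interval"))
print(central_tendency([1, 2, 3, 100], "ratio", skewed=True))
```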
References
- J. Han and M. Kamber, Data Mining: Concepts and Techniques, 3rd ed. Amsterdam; Boston: Elsevier/Morgan Kaufmann, 2012.
- Laerd Statistics, “Measures of central tendency,” Laerd Statistics, 2018. https://statistics.laerd.com/statistical-guides/measures-central-tendency-mean-mode-median.php
- “Central tendency,” Wikipedia, Jul. 13, 2020. https://en.wikipedia.org/wiki/Central_tendency