What Are Summary Statistics?
Summary statistics describe a dataset with a few key numbers. Instead of reading all 100 values in a list, you can say "the average is 85" and get a quick sense of where the data sits.
scores = [85, 92, 78, 95, 88]
print("Min:", min(scores)) # 78
print("Max:", max(scores)) # 95
print("Sum:", sum(scores)) # 438
print("Count:", len(scores)) # 5
print("Mean:", sum(scores) / len(scores)) # 87.6
💡
Mean vs. Median
The mean (average) is pulled toward extreme values (outliers). The median is the middle value when the data is sorted, or the average of the two middle values when the count is even, so it is more robust. Always check both!
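To make "middle value when sorted" concrete, here is a minimal sketch of computing a median by hand. The helper name manual_median is made up for illustration and is not part of Python.

```python
def manual_median(values):
    """Return the median of a list (hypothetical helper, for illustration only)."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)          # sorting puts the middle value in the middle
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:         # odd count: one middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [85, 92, 78, 95, 88]
print("Median:", manual_median(scores))  # 88
```

With an even number of values, such as [85, 92, 78, 95, 88, 85], the same helper averages the two middle values (85 and 88) and returns 86.5.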
The statistics Module
Python's standard-library statistics module gives you median and mode without writing the formulas yourself:
import statistics
scores = [85, 92, 78, 95, 88, 85]
print("Median:", statistics.median(scores)) # 86.5
print("Mode:", statistics.mode(scores)) # 85
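A dataset can have more than one most-common value. As a small sketch (tied_scores is made-up data), statistics.multimode, available in Python 3.8 and later, returns every mode as a list, while statistics.mode returns only the first one it encounters:

```python
import statistics

# Made-up data where 85 and 92 each appear twice
tied_scores = [85, 92, 85, 92, 78]
print("Mode:", statistics.mode(tied_scores))            # 85
print("All modes:", statistics.multimode(tied_scores))  # [85, 92]
```

On Python versions older than 3.8, multimode does not exist and mode raises an error on ties, so this example assumes a reasonably recent interpreter.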
Range and Spread
The range is the difference between the highest and lowest values. It gives a quick, rough sense of how spread out the data is:
data_range = max(scores) - min(scores)
print("Range:", data_range) # 17
🆕
Outlier Effect
If one student scores 20 in a class of ten where the other nine scored 90, the mean drops to 83 but the median stays at 90. That's why data scientists look at both!
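The effect is easy to check in code. This sketch assumes a class of ten: nine students who scored 90 and one who scored 20. The class size and the name class_scores are assumptions chosen for illustration.

```python
import statistics

# Assumed class: nine scores of 90 and one outlier of 20
class_scores = [90] * 9 + [20]

mean_score = sum(class_scores) / len(class_scores)
median_score = statistics.median(class_scores)

print("Mean:", mean_score)      # 83.0
print("Median:", median_score)  # 90.0
```

A single low score moves the mean by 7 points, while the median does not move at all.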