Dealing with Skewness
3. Understanding Those Lopsided Distributions
Okay, so what happens if your box plot isn't symmetrical? That's where skewness comes in. Skewness refers to the asymmetry of a distribution. If the data is skewed, the values bunch up on one side and trail off in a longer tail on the other. A normal distribution is symmetric, so if you're trying to tell whether a box plot looks normally distributed, it helps to recognize when it clearly doesn't.
If the median is closer to the bottom of the box, and the whisker is longer on the higher end, that's a sign of a right skew (also known as positive skew). This means most values are clustered at the low end, while a few large values stretch the upper tail and pull the mean above the median. Think of income distribution: most people earn a moderate income, while a smaller number earn very high incomes, creating a right skew.
Conversely, if the median is closer to the top of the box, and the whisker is longer on the lower end, you've got a left skew (or negative skew). In this case, most values are clustered at the high end, while a few small values stretch the lower tail and pull the mean below the median. Think of age at death: most people live to a relatively old age, while fewer people die young, resulting in a left skew.
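If you'd rather not eyeball it, the "where does the median sit inside the box" rule can be turned into a number. The sketch below uses Bowley's quartile skewness, which is one common way to do this and not something a box plot itself defines. The function name box_skew, the tolerance of 0.1, and the sample data are all made up for illustration, so treat the cutoff as an assumption rather than a standard.

```python
import statistics


def box_skew(values, tolerance=0.1):
    """Classify skew from where the median sits inside the box.

    Uses Bowley's quartile skewness: (Q3 + Q1 - 2 * median) / (Q3 - Q1).
    The result runs from -1 to 1. `tolerance` is an arbitrary cutoff
    chosen for this illustration, not an established threshold.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        # The box has no height, so the median's position says nothing.
        return 0.0, "undetermined"
    coefficient = (q3 + q1 - 2 * median) / iqr
    if coefficient > tolerance:
        label = "right skew"   # median near the bottom of the box
    elif coefficient < -tolerance:
        label = "left skew"    # median near the top of the box
    else:
        label = "roughly symmetric"
    return coefficient, label


# Made-up samples, loosely echoing the income and age-at-death examples.
incomes = [20, 22, 25, 27, 30, 35, 45, 60, 90, 150]
ages = [30, 55, 68, 74, 78, 81, 83, 85, 87, 89]

for name, sample in [("incomes", incomes), ("ages", ages)]:
    coefficient, label = box_skew(sample)
    print(f"{name}: {coefficient:+.2f} ({label})")
```

This only looks at the box, not the whiskers, so it can miss skew that shows up mainly in the tails. It's a quick check, not a replacement for looking at the plot.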
Spotting skewness in a box plot is a valuable skill. It tells you that the data isn't symmetric around the median, so the mean and the median can sit noticeably apart, which can impact your analysis and interpretations. If you suspect skewness, it's worth exploring other visualization techniques, like histograms, to get a clearer picture of the data's shape.
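To follow up on a suspected skew without reaching for a plotting library, here's a minimal sketch that compares the mean with the median and prints a rough text histogram. The helper name bin_counts, the choice of five equal-width bins, and the sample data are assumptions for illustration only.

```python
import statistics


def bin_counts(values, bins=5):
    """Count how many values fall into each of `bins` equal-width bins."""
    low, high = min(values), max(values)
    span = (high - low) / bins or 1  # fall back to 1 if all values are equal
    counts = [0] * bins
    for value in values:
        # The maximum would land one past the end, so clamp it to the last bin.
        index = min(int((value - low) / span), bins - 1)
        counts[index] += 1
    return counts


# Made-up, right-skewed sample (think incomes in thousands).
incomes = [20, 22, 25, 27, 30, 35, 45, 60, 90, 150]

print("mean:  ", statistics.mean(incomes))
print("median:", statistics.median(incomes))

# One row of '#' marks per bin, lowest bin first.
for count in bin_counts(incomes):
    print("#" * count)
```

With a right-skewed sample like this one, the mean comes out higher than the median and the bars pile up in the first bin before thinning out. A left-skewed sample would show the reverse.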