An R Introduction to Statistics

Skewness

The skewness of a data population is defined by the following formula, where μ2 and μ3 are the second and third central moments.

γ1 = μ3 ∕ μ2^(3∕2)

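Intuitively, the skewness is a measure of asymmetry: it is zero for a perfectly symmetric distribution. As a rule of thumb, negative skewness indicates that the mean of the data values is less than the median, and the data distribution is left-skewed. Positive skewness indicates that the mean of the data values is larger than the median, and the data distribution is right-skewed. The rule is a guideline rather than a guarantee, and it can fail for some distributions, such as multimodal ones.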
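
To make the definition concrete, here is a minimal sketch in base R that computes γ1 directly from the second and third central moments. The helper name central_moment and the sample vector x are assumptions made up for this illustration; they are not part of any package.

```r
# Hypothetical helper: k-th central moment, dividing by n (population form)
central_moment <- function(x, k) mean((x - mean(x))^k)

# Illustrative data with a long right tail (made up for this sketch)
x <- c(1, 2, 2, 3, 3, 3, 4, 10)

mu2 <- central_moment(x, 2)      # second central moment
mu3 <- central_moment(x, 3)      # third central moment
gamma1 <- mu3 / mu2^(3/2)        # skewness as defined by the formula

gamma1 > 0                       # TRUE: the skewness is positive
mean(x) > median(x)              # TRUE: the mean exceeds the median
```

For this vector the single large value pulls the mean above the median, which matches the rule of thumb for right-skewed data.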

Problem

Find the skewness of eruption duration in the data set faithful.

Solution

We apply the function skewness from the e1071 package to compute the skewness coefficient of eruptions. As the package is not in the core R library, it has to be installed first (for example with install.packages("e1071")) and then loaded into the R workspace.

> library(e1071)                    # load e1071 
> duration = faithful$eruptions     # eruption durations 
> skewness(duration)                # apply the skewness function 
[1] -0.41355

Answer

The skewness of eruption duration is -0.41355. It indicates that the eruption duration distribution is skewed towards the left.
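
As a cross-check, the same quantity can be sketched in base R without e1071. The variable names m2, m3, g1 and b1 are illustrative assumptions. One caveat worth stating: the skewness function in e1071 takes a type argument, and its default (type = 3) divides by the cube of the sample standard deviation rather than by μ2^(3∕2), so it differs from the formula above by a factor of ((n − 1)∕n)^(3∕2).

```r
duration <- faithful$eruptions           # eruption durations (datasets package)
n <- length(duration)

m2 <- mean((duration - mean(duration))^2)   # second central moment
m3 <- mean((duration - mean(duration))^3)   # third central moment

g1 <- m3 / m2^(3/2)                # moment-based skewness from the formula
b1 <- g1 * ((n - 1) / n)^(3/2)     # rescaled to the sample standard deviation

round(b1, 5)                       # should agree with the e1071 result above
mean(duration) < median(duration)  # consistent with a left-skewed distribution
```

For a sample of this size the two versions are close, so the conclusion that the distribution is left-skewed does not depend on which one is used.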

Exercise

Find the skewness of eruption waiting period in faithful.