Cumulative Frequency Distribution
The cumulative frequency distribution of a quantitative variable summarizes, for each of a set of levels, how many data values fall below that level.
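As a minimal sketch of the idea, cumulative frequencies are the running totals of the class frequencies. The counts in freq below are made-up numbers chosen only for illustration; they are not taken from any data set.

```r
# Hypothetical class counts for three consecutive intervals (made-up numbers)
freq = c(3, 5, 2)

# cumsum returns the running total: each entry is the sum of all counts
# up to and including that class
cumfreq = cumsum(freq)
print(cumfreq)
```

The last running total always equals the sum of all class counts.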
Example
In the data set faithful, the cumulative frequency distribution of the eruptions variable shows, for each of a set of chosen levels, the total number of eruptions whose durations fall below that level.
Problem
Find the cumulative frequency distribution of the eruption durations in faithful.
Solution
We first find the frequency distribution of the eruption durations as follows. Further details can be found in the Frequency Distribution tutorial.
> duration = faithful$eruptions
> breaks = seq(1.5, 5.5, by=0.5)
> duration.cut = cut(duration, breaks, right=FALSE)
> duration.freq = table(duration.cut)
We then apply the cumsum function to the frequency table to compute the cumulative frequency distribution.
> duration.cumfreq = cumsum(duration.freq)
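The steps so far can be collected into one self-contained script. This is a sketch that assumes the built-in faithful data set from base R; the two stopifnot checks are additions for illustration and are not part of the original solution.

```r
# faithful ships with base R (package datasets), so no extra install is assumed
duration = faithful$eruptions

# Half-open intervals closed on the left: [1.5,2), [2,2.5), ..., [5,5.5)
breaks = seq(1.5, 5.5, by=0.5)
duration.cut = cut(duration, breaks, right=FALSE)
duration.freq = table(duration.cut)

# Running total of the class counts
duration.cumfreq = cumsum(duration.freq)

# Sanity checks (illustrative additions): cumulative counts never decrease,
# and the last one equals the number of observations when the breaks
# cover the whole range of the data
stopifnot(all(diff(duration.cumfreq) >= 0))
stopifnot(tail(duration.cumfreq, 1) == length(duration))

print(duration.cumfreq)
```

If the final check fails on another variable, some values lie outside the chosen breaks and cut has assigned them NA, so the breaks should be widened.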
Answer
The cumulative frequency distribution of the eruption durations is:
> duration.cumfreq
[1.5,2) [2,2.5) [2.5,3) [3,3.5) [3.5,4) [4,4.5) [4.5,5)
     51      92      97     104     134     207     268
[5,5.5)
    272
Enhanced Solution
We apply the cbind function to print the result in column format.
> cbind(duration.cumfreq)
        duration.cumfreq
[1.5,2)               51
[2,2.5)               92
[2.5,3)               97
[3,3.5)              104
[3.5,4)              134
[4,4.5)              207
[4.5,5)              268
[5,5.5)              272
Exercise
Find the cumulative frequency distribution of the eruption waiting periods in faithful.