An R Introduction to Statistics

Cumulative Relative Frequency Graph

A cumulative relative frequency graph of a quantitative variable is a curve that shows its cumulative relative frequency distribution.

Example

In the data set faithful, a point in the cumulative relative frequency graph of the eruptions variable shows the proportion of eruptions whose durations are less than or equal to a given level.

Problem

Find the cumulative relative frequency graph of the eruption durations in faithful.

Solution

We first find the cumulative relative frequency distribution of the eruption durations as follows. Check the previous tutorial on Cumulative Relative Frequency Distribution for details.

> duration = faithful$eruptions 
> breaks = seq(1.5, 5.5, by=0.5) 
> duration.cut = cut(duration, breaks, right=FALSE) 
> duration.freq = table(duration.cut) 
> duration.cumfreq = cumsum(duration.freq) 
> duration.cumrelfreq = duration.cumfreq / nrow(faithful)
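
As a sanity check on the binned computation, the sketch below (an illustration, not part of the original solution) recomputes each cumulative relative frequency directly from the data. Because the intervals are closed on the left (right=FALSE), the value attached to each upper break point should equal the proportion of eruptions strictly shorter than that break. The name direct is introduced here for illustration only.

```r
duration = faithful$eruptions
breaks = seq(1.5, 5.5, by=0.5)
duration.cut = cut(duration, breaks, right=FALSE)
duration.cumrelfreq = cumsum(table(duration.cut)) / length(duration)

# proportion of eruptions strictly below each upper break point
direct = sapply(breaks[-1], function(b) mean(duration < b))

# compare the binned result with the direct computation
all.equal(unname(duration.cumrelfreq), direct)
```

If the two computations agree, all.equal returns TRUE. The last entry is 1 because every eruption falls inside the chosen range of breaks.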

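We then plot it against the break points. There are nine break points but only eight cumulative values, so we prepend a zero: no eruption is shorter than the first break point of 1.5 minutes, and the curve starts at zero there.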

> cumrelfreq0 = c(0, duration.cumrelfreq) 
> plot(breaks, cumrelfreq0, 
+   main="Old Faithful Eruptions",  # main title 
+   xlab="Duration minutes", 
+   ylab="Cumulative eruption proportion") 
> lines(breaks, cumrelfreq0)        # join the points
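
The two plotting steps can also be collapsed into a single call. This is a minimal sketch, assuming the same breaks and bins as in the solution; type="o" is a standard plot argument that overplots the points and the connecting line segments.

```r
duration = faithful$eruptions
breaks = seq(1.5, 5.5, by=0.5)
duration.cut = cut(duration, breaks, right=FALSE)
cumrelfreq0 = c(0, cumsum(table(duration.cut)) / length(duration))

# type="o" draws the points and joins them in one call
plot(breaks, cumrelfreq0, type="o",
  main="Old Faithful Eruptions",
  xlab="Duration minutes",
  ylab="Cumulative eruption proportion")
```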

Answer

The cumulative relative frequency graph of the eruption duration is:

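[Figure: "Old Faithful Eruptions", cumulative eruption proportion against duration in minutes, with points at the break points joined by line segments]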

Alternative Solution

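We create the empirical cumulative distribution function Fn with the built-in function ecdf, then plot Fn directly. There is no need to compute the cumulative frequency distribution beforehand. Note that ecdf returns a step function that jumps at each observed duration rather than a curve through the break points, so the resulting graph is finer-grained than the binned one.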

> Fn = ecdf(duration) 
> plot(Fn, 
+   main="Old Faithful Eruptions", 
+   xlab="Duration minutes", 
+   ylab="Cumulative eruption proportion")
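
Since Fn is an ordinary R function, it can also be evaluated at any duration to read a value off the graph. The sketch below is illustrative only; the 3-minute level and the names p3 and p3.direct are arbitrary choices, not part of the original solution.

```r
duration = faithful$eruptions
Fn = ecdf(duration)

# Fn(x) is the proportion of eruptions lasting x minutes or less
p3 = Fn(3)

# the same proportion computed directly from the data
p3.direct = mean(duration <= 3)

# Fn is vectorized, so several levels can be read off at once
Fn(c(2, 3, 4))
```

The two values p3 and p3.direct should agree, which is a quick way to confirm what a point on the ecdf graph represents.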

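[Figure: "Old Faithful Eruptions", step plot of the empirical cumulative eruption proportion against duration in minutes]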

Exercise

Find the cumulative relative frequency graph of the eruption waiting periods in faithful.