This week, I had my first minor “Ah-ha!” moment while struggling with the same issue in homework 3 that I had run into on a previous problem in homework 2. For many of you, I’m sure this is a “duh”, but for me it was a brilliant new duh that will now always be part of my R vocabulary.

The block came when we were given data with a variable (*height*) and the frequency (*x*) of each value of that variable in a sample of size *n*, and had to calculate the median, etc., of the variable. I went around in circles trying to find a way to do this in R without calculating by hand, i.e., for the median, finding the value at position n/2 in the sorted data. I tried plotting the data, multiplying, and just staring at the screen for a while (which, no matter how often I think it will help, really doesn’t), to no avail. Finally, as I looked through the notes again, a great beam of light came down and shone on the rep() function, and at last order came back into life. If we pass the two columns to rep(), we get a vector in which each value of the variable is repeated as many times as its frequency:

```r
vector <- rep(height, x)
```

Now we can run all the wonderful statistical functions on this sample vector without any issues. This was a thrilling, happy-dance-worthy moment that lasted for the duration of that homework question. I hope that sharing my moment will inspire other happy dances, or at least remind you of one of your own early accomplishments in the great world of learning R.
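For anyone who wants to try it, here is a minimal sketch of the idea. The heights and frequencies are made-up numbers for illustration, not the homework data, and `heights_expanded` is just a name I picked:

```r
# Hypothetical frequency table (NOT the homework data):
# each distinct height and how many times it was observed
height <- c(60, 62, 64, 66, 68)
x      <- c(2, 5, 8, 4, 1)

# Repeat each height by its frequency: one element per observation
heights_expanded <- rep(height, times = x)

length(heights_expanded)  # 20, the same as sum(x), i.e. the sample size n
median(heights_expanded)  # 64
mean(heights_expanded)    # 63.7
sd(heights_expanded)      # sample standard deviation of the expanded data
```

Passing the frequencies as `times` makes it explicit that each element of `height` gets its own repeat count, which is what plain `rep(height, x)` does as well.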


I would like to share in your happy-dance moment, because I had one too when I figured out the same rep function. It is tempting to just call the median function on the height column of the data, but that gives an incorrect median because it ignores the frequencies. Likewise, the mean function on that column treats each distinct height as a single observation, so it does not use the correct sample size. I had trouble with this too until I discovered the rep function, which expands the data out, listing every data point.
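A quick sketch of the difference, using made-up numbers rather than the homework data (the variable names are only for illustration). For the mean, base R's `weighted.mean()` should give the same answer as expanding first:

```r
# Hypothetical frequency table (NOT the homework data)
height <- c(60, 62, 64, 66, 68)
x      <- c(1, 1, 2, 6, 10)

# Tempting but wrong: treats each distinct height as one observation
naive_median <- median(height)  # 64
naive_mean   <- mean(height)    # 64

# Expand by frequency first, then summarize
expanded    <- rep(height, times = x)
true_median <- median(expanded)  # 67
true_mean   <- mean(expanded)    # 66.3

# Alternative for the mean only: weight each height by its frequency
wmean <- weighted.mean(height, w = x)  # 66.3
```

Here most of the observations sit at the tall end, so ignoring the frequencies pulls both the median and the mean too low.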

I also had a similar moment when I discovered the suite of apply functions for homework 3.