Using R — A Script Introduction to R

This entry is part 3 of 21 in the series Using R

To many people, R is like the Everglades. They’ve heard of it, and they know it’s big and has amazing treasures deep inside. Articles in the media can make it look irresistible. But after a personal or even secondhand experience, people also learn that R can be a big swamp where you are all but guaranteed to get soggy boots and mosquito bites before you’re done. And there is always the distinct possibility of getting lost and falling into a ‘gator hole’. Indeed, if you go in without a guide, hoping to get in and out quickly, you probably won’t enjoy it much. This post contains a script that shows you some of the sights without letting you fall in. If you like to learn by example, read on.

The rest of this post is the verbatim script with graphics embedded in the appropriate places. You can also download the script and run it yourself.  The comments in this script capture a session of working with and thinking about a dataset.  This script doesn’t try to cover everything. On the contrary, it pedantically reuses as few techniques as possible to show that you can do a lot with a little.

This script also demonstrates how to be systematic with respect to commenting, variable naming, setting graphical parameters, etc.  One of the keys to working successfully with R is writing scripts that explain what they are doing and contain consistent, readable, verging-on-predictable code.

I look forward to any suggestions for corrections/improvements.

OK, perhaps we shouldn’t expect a job offer from the Florida Department of Tourism any time soon. But I hope this short tour through the R swamp sheds some light on how just a few techniques can help you begin interrogating large datasets and telling interesting stories.

Happy Exploring!


4 Responses to Using R — A Script Introduction to R

  1. Wei says:

    I recommend adding

    saveAsPng <- TRUE

    Great post!!!

  2. bk says:

    Awesome post. Thanks..

  3. Curlew says:

    Amazing and very detailed post!
    Two comments:
    Why create so many mask variables? All you do is indexing, and that is a lot easier in R using the command “which”, as in ‘gator_data[which(gator == "text")]’. But I guess this is just personal taste.
    Furthermore, I didn’t know about the ‘par(new=T)’ command, but you could also add the parameter ‘add=T’ to the plot command.

    • Jonathan Callahan says:

      Thanks for the suggestions.

      Regarding which()

      Yes, we could have created a set of indices to plot “Tree Islands” with the following:

      I prefer masks because of the possibility of generating SQL type queries from these masks:

      This kind of bit-masking is ridiculously fast and, once you’ve defined a bunch of masks, allows you to be very thorough in your interrogation of the data. You can’t do that sort of thing with disjoint sets of indices.
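As a minimal sketch of the two approaches, assuming a hypothetical `gator_data` data frame with made-up `habitat` and `length_m` columns (these names and values are illustrative only and are not taken from the script), index sets and logical masks might look like this:

```r
# Hypothetical stand-in for the dataset; the columns and values are invented
gator_data <- data.frame(
  habitat  = c("Tree Islands", "Marsh", "Tree Islands", "Slough", "Marsh"),
  length_m = c(2.1, 1.4, 3.0, 2.6, 0.9),
  stringsAsFactors = FALSE
)

# Logical masks: one TRUE/FALSE value per row
treeIsland_mask <- gator_data$habitat == "Tree Islands"
large_mask      <- gator_data$length_m > 2.0

# The which() approach turns a mask into a set of row indices
treeIsland_indices <- which(treeIsland_mask)

# Masks combine with &, | and ! much like the terms of a SQL WHERE clause
largeTreeIsland      <- gator_data[treeIsland_mask & large_mask, ]
smallOrNotTreeIsland <- gator_data[!treeIsland_mask | !large_mask, ]
```

Index sets can still be combined with `intersect()` and `union()`, but negation requires knowing the full set of rows, which is where masks tend to be more convenient.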

      Regarding add=TRUE

      I believe I’ve encountered cases where this doesn’t work but I’m not sure what they were. Using par(new=TRUE) always seems to work.
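A minimal sketch of the overlay technique, using invented data and drawing to a null PDF device so that it runs without a display. One likely reason `add=TRUE` sometimes fails is that it is an argument of particular functions such as `curve()` rather than something the default `plot()` method accepts; treat that explanation as an assumption rather than a diagnosis of the cases mentioned above.

```r
# Draw to a null PDF device so the sketch runs without opening a window
pdf(NULL)

# Invented data for illustration
x  <- 1:10
y1 <- x^2
y2 <- 100 - x^2

# Fix the axis limits up front so both layers share one coordinate system
xlim <- range(x)
ylim <- range(c(y1, y2))

# First layer
plot(x, y1, type = "l", xlim = xlim, ylim = ylim, xlab = "x", ylab = "y")

# par(new = TRUE) tells the next high-level plot not to clear the frame
par(new = TRUE)
plot(x, y2, type = "l", lty = 2, xlim = xlim, ylim = ylim,
     axes = FALSE, xlab = "", ylab = "")

# Functions that do take an 'add' argument, such as curve(), can draw directly
curve(50 + 0 * x, from = 1, to = 10, add = TRUE, col = "gray")

dev.off()
```

Because `par(new = TRUE)` starts a fresh coordinate system, the second `plot()` call repeats `xlim` and `ylim` and suppresses its axes and labels; otherwise the two layers would not line up.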