The following class materials on using R in seismology were developed by Mazama Science for the IRIS Data Management Center in Seattle, Washington.
The complete class is available at this location:
http://mazamascience.com/Classes/IRIS_2015/
The IRIS DMC archives and distributes data to support the seismological research community. The class described here introduces DMC staff and other seismologists to the R statistical programming language and its use with seismological data available from DMC web services. The capabilities of the seismicRoll, IRISSeismic and IRISMustangMetrics packages, developed as part of the MUSTANG project, are demonstrated.
Permission has been granted to release these class materials to the public in the interest of encouraging seismologists to experiment with R for their daily work.
Class materials are broken up into ten separate lessons that assume some coding experience but not necessarily any familiarity with R. Lessons are presented in sequential order and assume the student already has R and RStudio installed on their computer. Autodidacts new to R should expect to spend about 20 to 30 hours completing the course. The target audience for these materials consists of IRIS DMC employees or graduate students with a degree in the natural sciences and some experience using scientific software such as MATLAB or Python.
Prerequisites:
Three R packages are required for later lessons and, if automatic installation doesn’t work, can be downloaded from the following CRAN links:
The lessons are as follows:
The first lesson serves as an introduction to fundamental programming concepts in R: functions, operators, vectorized data and data structures (vector, list, matrix, dataframe). By the end of the first lesson, students should be able to open and plot simple data frames and access help documents and source code associated with R functions.
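As a rough illustration of the concepts listed for the first lesson, the following base R sketch builds each of the four data structures and applies a vectorized operation. The values, the column names and the km_to_miles helper are made up for illustration and are not taken from the class materials.

```r
# Illustrative values only; not taken from the class materials.

# Vector: all elements share one type, and arithmetic is vectorized.
depth_km <- c(10, 35, 70, 300)
depth_m <- depth_km * 1000          # applied element-wise, no loop needed

# List: elements may have different types and lengths.
event <- list(id = "ev001", magnitude = 5.2, stations = c("ANMO", "COLA"))

# Matrix: a two-dimensional array holding a single type.
m <- matrix(1:6, nrow = 2, ncol = 3)

# Data frame: a table whose columns are equal-length vectors.
df <- data.frame(
  station = c("ANMO", "COLA", "HRV", "KBS"),
  depth_km = depth_km,
  stringsAsFactors = FALSE
)

# A small user-defined function (hypothetical helper), applied to a column.
km_to_miles <- function(km) km * 0.621371
df$depth_miles <- km_to_miles(df$depth_km)

str(df)                             # compact summary of the structure
deep <- df[df$depth_km > 50, ]      # logical indexing selects rows

# In an interactive session, ?mean opens a help page, and typing a
# function name without parentheses (for example km_to_miles) prints
# its source code.
```

Note that no explicit loop is needed anywhere: operators and most functions in R work on whole vectors at once.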
Lesson 02 focuses on data frames and uses publicly available metrics data from the MUSTANG database. This lesson includes a discussion of factors and describes several basic plot types: bar and pie plots, histograms, scatter plots and box plots.
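To give a flavor of factors and basic plots, here is a minimal base R sketch. The data frame below is invented for illustration and is not real MUSTANG output; the column name percent_availability is likewise an assumption.

```r
# Made-up metric values for illustration; not real MUSTANG output.
metrics <- data.frame(
  network = c("IU", "IU", "US", "US", "US", "II"),
  station = c("ANMO", "COLA", "OXF", "AAM", "ACSO", "KDAK"),
  percent_availability = c(99.5, 87.2, 100, 92.4, 75.0, 98.1),
  stringsAsFactors = FALSE
)

# A factor stores a categorical variable as integer codes plus a set of levels.
metrics$network <- factor(metrics$network)
levels(metrics$network)             # "II" "IU" "US"

# Factors drive grouping operations such as counting and per-group summaries.
counts <- table(metrics$network)
mean_by_network <- tapply(metrics$percent_availability, metrics$network, mean)

# Basic plot types: bar plot, histogram and box plot.
barplot(counts, xlab = "Network", ylab = "Number of stations")
hist(metrics$percent_availability, xlab = "Percent availability", main = "")
boxplot(percent_availability ~ network, data = metrics,
        xlab = "Network", ylab = "Percent availability")
```

When run non-interactively, for example with Rscript, base R writes these plots to a PDF file rather than opening a window.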
Lesson 03 covers R data types and recommended packages for working with vectors of character strings and dates: the stringr package for strings and the lubridate package for dates.
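The sketch below shows the kind of string and date handling these two packages make convenient. It assumes stringr and lubridate are installed; the identifiers and timestamps are hypothetical examples in network.station.location.channel form, not data from the class.

```r
library(stringr)
library(lubridate)

# Hypothetical identifiers in network.station.location.channel form.
sncl <- c("IU.ANMO.00.BHZ", "US.OXF..BHZ", "II.KDAK.10.LHZ")

# Pattern matching over a whole character vector at once.
is_bhz <- str_detect(sncl, "BHZ$")

# Split on a literal "." into exactly four columns (a character matrix).
parts <- str_split_fixed(sncl, fixed("."), 4)
network <- parts[, 1]
channel <- parts[, 4]

# Extract the first character of each channel code.
band <- str_sub(channel, 1, 1)

# Parse a timestamp, add a period and truncate to the start of the day.
t0 <- ymd_hms("2015-03-01 23:59:30", tz = "UTC")
t1 <- t0 + minutes(1)
day_start <- floor_date(t1, unit = "day")

# Elapsed time between two timestamps, in seconds.
elapsed <- as.numeric(difftime(t1, t0, units = "secs"))
```

Using fixed(".") matters here: a bare "." would be treated as a regular expression that matches any character.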
Lesson 04 brings together skills learned in previous lessons to begin creating customized plot functions. Over the course of this exercise students will learn about R’s graphical parameters as well as a little about using R as a programming language.
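A customized plot function of this kind might look something like the following sketch. The function name plot_timeseries, its arguments and the synthetic signal are all hypothetical and are not the function built in the lesson.

```r
# Hypothetical helper; the name and arguments are illustrative only.
plot_timeseries <- function(time, value, title = "", line_color = "steelblue") {
  # Set graphical parameters and restore the previous ones on exit,
  # so the function does not leave side effects on later plots.
  old_par <- par(mar = c(4, 4, 2, 1), las = 1)
  on.exit(par(old_par))

  plot(time, value, type = "l", col = line_color, lwd = 1.5,
       xlab = "Time (s)", ylab = "Amplitude", main = title)

  # Dashed horizontal reference line at the mean value.
  abline(h = mean(value), lty = "dashed", col = "gray40")

  # Return the data range invisibly so callers can reuse it.
  invisible(range(value))
}

# Synthetic damped sine wave standing in for real data.
t <- seq(0, 10, by = 0.01)
signal <- sin(2 * pi * 0.5 * t) * exp(-0.2 * t)
value_range <- plot_timeseries(t, signal, title = "Synthetic damped sine")
```

Wrapping par() changes in on.exit() is a common design choice for plot functions, since it keeps each function's styling from leaking into the next plot.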
Lesson 05 describes the IRISSeismic, seismicRoll and IRISMustangMetrics packages for working with IRIS DMC seismic data and metrics. The object-oriented S4 classes at the core of the IRISSeismic package are introduced, along with basic functions to obtain and plot seismic traces.
Lesson 06 goes into more detail about the methods and plotting functions available for working with raw seismic signals in the Trace and Stream classes. Examples focus on data exploration techniques for working with raw seismic data and metadata.
Lesson 07 introduces functions defined in the IRISMustangMetrics package for creating metrics objects and for working with metrics data obtained from the MUSTANG metrics database.
Lesson 08 is a very brief introduction to Hadley Wickham’s ggplot2 package.
Lesson 09 walks through an example that creates a new metric from raw seismic data and prepares it for submission to MUSTANG.
Lesson 10 describes the steps that should be taken by DMC personnel whenever changes need to be made to any of the seismic packages. Steps include code documentation, functional testing, package compilation and submission to CRAN.
We hope these lessons encourage seismologists working with IRIS DMC web services to take a look at R and experiment with it for a variety of data management, analysis and visualization needs. R does have a steep learning curve but, once mastered, provides users with an extremely powerful and customizable tool for all sorts of analysis.
Best of Luck Learning R!