As part of the MUSTANG project, three R packages were developed whose main purpose is to generate summary data quality metrics from raw seismic data. Creating these metrics required a set of underlying object classes and methods in R that are tailored to the needs of seismic data. The base classes and methods in the IRISSeismic package are inspired by the ObsPy Python toolbox. Additional classes and methods support data provided by IRIS DMC web services.

Three separate packages are currently provided, presented here in dependency order:

  • seismicRoll
  • IRISSeismic
  • IRISMustangMetrics

Each package is briefly described below, along with an explanation of R’s S4 class system for object-oriented programming.

‘seismicRoll’ Package

You can install the seismicRoll package with the ‘Packages’ tab in RStudio. Simply click on the ‘Install’ button, type ‘seismicRoll’ as the package name and click ‘Install’. This will install all dependencies and the package itself.

Once installed you can click on the package in the ‘Packages’ tab to see associated documentation.

The provided ‘rolling’ functions all invoke C++ code that is significantly faster than the equivalent code written in R. These are low-level functions, of interest mostly to those who will be creating new metrics.
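To make the idea of a ‘rolling’ function concrete, here is a minimal sketch of a rolling median written in plain R. The helper name naive_roll_median() is hypothetical and is not part of seismicRoll; it only illustrates the kind of loop that the package's C++ code replaces, and its handling of the window edges is an assumption made for this example.

```r
# Naive rolling median in plain R.
# Hypothetical helper for illustration only; NOT a seismicRoll function.
naive_roll_median <- function(x, n = 7) {
  stopifnot(n %% 2 == 1, n <= length(x))   # odd window that fits in the data
  half <- (n - 1) / 2
  out <- rep(NA_real_, length(x))          # edges without a full window stay NA
  for (i in seq(half + 1, length(x) - half)) {
    out[i] <- median(x[(i - half):(i + half)])
  }
  out
}

# A single spike (100) is suppressed by the 3-point median
x <- c(1, 2, 3, 100, 5, 6, 7)
naive_roll_median(x, n = 3)
## [1] NA  2  3  5  6  6 NA
```

A loop like this is easy to read but slow on long series, which is why compiled code is attractive for this kind of operation.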


Task 1: seismicRoll functions

Explore the seismicRoll functions by running the associated examples and experimenting on your own:


‘IRISSeismic’ Package

The IRISSeismic package contains all of the core classes and methods for working with seismic data obtained from IRIS web services. The following classes are defined:

  • IrisClient
  • Stream
  • Trace
  • TraceHeader

You can install the IRISSeismic package with the ‘Packages’ tab in RStudio. Simply click on the ‘Install’ button, type ‘IRISSeismic’ as the package name and click ‘Install’. This will install all dependencies and the package itself.

Installation of packages from local source files can still be done from RStudio. Click ‘Packages > Install’ and then choose ‘Install from: Package archive’ and browse to the location of the downloaded package.
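The same installation can also be done from the R console. The sketch below assumes the three packages are available from your configured CRAN mirror; the helper missing_pkgs() is a hypothetical name introduced here, and the install step is guarded so that it only runs in an interactive session.

```r
# Packages used in these lessons, in dependency order
pkgs <- c("seismicRoll", "IRISSeismic", "IRISMustangMetrics")

# Hypothetical helper: return the names of packages that are not yet installed
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_pkgs(pkgs)

# Only install interactively, and only what is actually missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```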

S3 Classes

Warning: This section may be of more interest to programmers than seismologists.

Before describing the functionality of the IRISSeismic package, students need to become acquainted with R’s S4 class system. And in order to understand S4 you probably need to understand a little bit about R’s support for different object-oriented class systems. Those interested in a more in-depth discussion of R’s object-oriented menagerie should look at Hadley Wickham’s OO field guide for a discussion of S3, S4 and RC classes as well as Winston Chang’s new R6 classes.

Because R began as a functional language, early versions had very shallow support for classes. The S3 class system is very simple to use and is still seen in many R packages. It works by assigning a class attribute to an R data type. In the following example the hist() function returns an S3 object, which is mostly just a list with an extra ‘class’ attribute whose value is ‘histogram’. When a polymorphic (generic) function is passed an argument, the class of that argument is checked and the matching method is called. This allows plot(h) to dispatch to plot.histogram(h).

h <- hist(rnorm(1e4), plot=FALSE)
class(h)
## [1] "histogram"
typeof(h)
## [1] "list"
attributes(h)
## $names
## [1] "breaks"   "counts"   "density"  "mids"     "xname"    "equidist"
## 
## $class
## [1] "histogram"
methods(plot)
##  [1] plot.acf*           plot.data.frame*    plot.decomposed.ts*
##  [4] plot.default        plot.dendrogram*    plot.density*      
##  [7] plot.ecdf           plot.factor*        plot.formula*      
## [10] plot.function       plot.hclust*        plot.histogram*    
## [13] plot.HoltWinters*   plot.isoreg*        plot.lm*           
## [16] plot.medpolish*     plot.mlm*           plot.ppr*          
## [19] plot.prcomp*        plot.princomp*      plot.profile.nls*  
## [22] plot.spec*          plot.stepfun        plot.stl*          
## [25] plot.table*         plot.ts             plot.tskernel*     
## [28] plot.TukeyHSD*     
## 
##    Non-visible functions are asterisked
plot(h)
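
The same mechanism is easy to reproduce with a class of your own. The following is a minimal sketch; the ‘pick’ class and the make_pick() constructor are hypothetical names invented for this illustration and do not come from any of the packages discussed here.

```r
# Hypothetical S3 class: a plain list tagged with a class attribute
make_pick <- function(station, time) {
  structure(list(station = station, time = time), class = "pick")
}

# S3 methods are ordinary functions named <generic>.<class>
format.pick <- function(x, ...) {
  paste0("Pick at ", x$station, " (t = ", x$time, " s)")
}
print.pick <- function(x, ...) {
  cat(format(x), "\n")   # format(x) dispatches to format.pick(x)
  invisible(x)
}

p <- make_pick("OXF", 12.5)
class(p)
## [1] "pick"
typeof(p)
## [1] "list"
print(p)                 # dispatches to print.pick(p)
## Pick at OXF (t = 12.5 s)
```

Nothing checks that a ‘pick’ really contains a station and a time; S3 relies entirely on convention, which is one of the motivations for S4.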

S4 Classes

As R evolved, another, more object-oriented class system appeared. This system is called S4 and uses the concept of slots as named components of an object. With S4 classes, methods are much more tightly bound to the particular class they operate on. Components of an S4 class are listed with the slotNames() function and can be accessed using the slot() function or the @ accessor. Here is an example from the IRISSeismic package:

library(IRISSeismic)

iris <- new("IrisClient")
class(iris)
## [1] "IrisClient"
## attr(,"package")
## [1] "IRISSeismic"
typeof(iris)
## [1] "S4"
slotNames(iris)
## [1] "site"      "debug"     "useragent"
slot(iris,'site') # using 'iris$site' would throw an error
## [1] "http://service.iris.edu"
iris@site
## [1] "http://service.iris.edu"
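
To see how slots and methods are tied to a class definition, it can help to define a small S4 class yourself. This is only a sketch: the ‘Station’ class and the describe() generic are hypothetical names made up for this example and are not part of IRISSeismic.

```r
library(methods)

# Hypothetical class for illustration only: each slot has a declared type
setClass("Station",
         representation(code = "character", elevation = "numeric"))

# A generic function plus a method registered for the "Station" class
setGeneric("describe", function(x, ...) standardGeneric("describe"))
setMethod("describe", "Station", function(x, ...) {
  paste0(x@code, " at ", x@elevation, " m")
})

sta <- new("Station", code = "OXF", elevation = 101)
describe(sta)
## [1] "OXF at 101 m"

# Unlike S3, slot types are checked when the object is created
bad <- tryCatch(new("Station", code = 42, elevation = 101),
                error = function(e) "rejected")
bad
## [1] "rejected"
```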

Stream, Trace and TraceHeader

The IrisClient class we introduced above is responsible for obtaining data from IRIS web services. Most of its methods are get~ methods that return dataframes containing the result of web service requests. The getDataselect() function, however, returns raw seismic data as a new Stream object. The structure of a Stream object is made visible in the following code:

# Request seismic data
starttime <- as.POSIXct("2002-04-20", tz="GMT")
endtime <- as.POSIXct("2002-04-21", tz="GMT")
st <- getDataselect(iris,"US","OXF","","BHZ",starttime,endtime)

# 'Stream' object
slotNames(st)
## [1] "url"                "requestedStarttime" "requestedEndtime"  
## [4] "act_flags"          "io_flags"           "dq_flags"          
## [7] "timing_qual"        "traces"
for (s in slotNames(st)) { print(paste0(s,' : ',class(slot(st,s)))) }
## [1] "url : character"
## [1] "requestedStarttime : POSIXct" "requestedStarttime : POSIXt" 
## [1] "requestedEndtime : POSIXct" "requestedEndtime : POSIXt" 
## [1] "act_flags : integer"
## [1] "io_flags : integer"
## [1] "dq_flags : integer"
## [1] "timing_qual : numeric"
## [1] "traces : list"
st@url
## [1] "http://service.iris.edu/fdsnws/dataselect/1/query?net=US&sta=OXF&loc=--&cha=BHZ&start=2002-04-20T00:00:00&end=2002-04-21T00:00:00&quality=B"
length(st@traces)
## [1] 5
# Pull out the first Trace using '[[ ]]' syntax 
tr <- st@traces[[1]]
class(tr)
## [1] "Trace"
## attr(,"package")
## [1] "IRISSeismic"
# 'Trace' object (what libmseed calls a "Trace Segment")
slotNames(tr)
## [1] "id"                    "stats"                 "Sensor"               
## [4] "InstrumentSensitivity" "InputUnits"            "data"
for (s in slotNames(tr)) { print(paste0(s,': ',class(slot(tr,s)),', length=',length(slot(tr,s)))) }
## [1] "id: character, length=1"
## [1] "stats: TraceHeader, length=1"
## [1] "Sensor: character, length=1"
## [1] "InstrumentSensitivity: numeric, length=1"
## [1] "InputUnits: character, length=1"
## [1] "data: numeric, length=10733"
# 'TraceHeader' object
print(tr@stats)
## Seismic Trace TraceHeader 
##  Network:        US 
##  Station:        OXF 
##  Location:        
##  Channel:        BHZ 
##  Quality:        M 
##  calib:          1 
##  npts:           10733 
##  sampling rate:  40 
##  delta:          0.025 
##  starttime:      2002-04-20 00:00:00 
##  endtime:        2002-04-20 00:04:28 
##  processing:

So we see that each Stream object consists of a list of Trace objects, each of which is guaranteed to be a continuous timeseries. Metadata associated with each Trace is kept in a TraceHeader object found in the @stats slot of the Trace.
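
The nesting described above can be mimicked with toy classes, which makes it possible to practice walking the structure without downloading any data. In this sketch the classes ToyHeader, ToyTrace and ToyStream and the helper make_trace() are hypothetical stand-ins with only a couple of slots each; they are not the IRISSeismic definitions.

```r
library(methods)

# Hypothetical stand-ins for TraceHeader, Trace and Stream
setClass("ToyHeader", representation(station = "character", npts = "integer"))
setClass("ToyTrace",  representation(stats = "ToyHeader", data = "numeric"))
setClass("ToyStream", representation(traces = "list"))

# Build a trace whose header records the number of points in its data
make_trace <- function(station, data) {
  new("ToyTrace",
      stats = new("ToyHeader", station = station, npts = length(data)),
      data = data)
}

toy_st <- new("ToyStream",
              traces = list(make_trace("OXF", c(1, 2, 3)),
                            make_trace("OXF", c(4, 5))))

# Walk the nesting: stream -> list of traces -> header of each trace
npts <- vapply(toy_st@traces, function(tr) tr@stats@npts, integer(1))
npts
## [1] 3 2
```

The same vapply() pattern over a list of traces is a convenient way to collect one metadata value per trace.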

WARNING: Trace objects can have millions of points that will bog down R if you try to plot all of them. DON’T DO THAT! The next lesson will cover optimized methods for working with lengthy Trace data.


Task 2: S4 objects

Use your knowledge of S4 accessor methods to investigate the different slots found in the four classes defined in the IRISSeismic package: IrisClient, Stream, Trace and TraceHeader. The following questions can help get you started.

  • Why do Stream objects keep track of requestedStarttime and requestedEndtime?
  • Where can you find information about the instrument used to collect data?
  • How could you determine the duration and size of gaps between adjacent Traces?

‘IRISMustangMetrics’ Package

The IRISMustangMetrics package defines a few more S4 classes related to the structure of several types of metrics used in MUSTANG:

Each of these classes has slots containing all the information needed to create a metric for inclusion in the MUSTANG database.

Additional functions defined in this package provide methods to obtain results from the MUSTANG database as well as algorithms for some of the more detailed metrics. Many of the simpler metrics functions in this package only need to assemble results available in the IRISSeismic package into one of the metric classes in preparation for submission to MUSTANG.

You can install the IRISMustangMetrics package with the ‘Packages’ tab in RStudio. Simply click on the ‘Install’ button, type ‘IRISMustangMetrics’ as the package name and click ‘Install’. This will install all dependencies and the package itself.


Task 3: IRIS web services

Read the IRISSeismic package documentation for ‘IrisClient-class’ to learn about the different web service requests that can be made. Use the following methods to obtain dataframe results and then investigate those dataframes with R.

