In this lesson we will bring together all of the tools we have learned in previous lessons to begin creating customized plot functions. Over the course of this exercise we will learn about R’s graphical parameters as well as a little about R as a programming language.

Functions

The most important new topic is the creation of functions. Functions in R are objects in their own right and can be passed around just like any other variable.

basic syntax

For our purposes, the simple syntax of functions is:

myfunction <- function(arg1, arg2, ... ) {

  ~statements~

  return(object)
}

Arguments to functions are named and are referenced inside the function by name. The call to the return() function is optional if your function is generating plots rather than returning a result.

Note: R uses pass-by-value rather than pass-by-reference. This means that a function cannot change the value of anything outside of its curly braces – its ‘scope’. However, a function does have read access to global variables

Our first function will capture everything we did in Lesson 02 to read in and convert output from MUSTANG web services into a nice dataframe:

# getMetricDF accepts a MUSTANG url and returns a metric dataframe
getMetricDF <- function(url, skip=1) {
 
  # Read in the data
  DF <- read.csv(url, stringsAsFactors=FALSE, skip=skip)
  
  # Convert character strings to class 'POSIXct'.
  DF$start <- as.POSIXct(DF$start, "%Y/%m/%d %H:%M:%OS", tz="GMT")
  DF$end <- as.POSIXct(DF$end, "%Y/%m/%d %H:%M:%OS", tz="GMT")
  
  # Add columns to the dataframe
  char_matrix <- stringr::str_split_fixed(DF$target,'\\.',5)
  DF$net <- as.factor(char_matrix[,1])
  DF$sta <- as.factor(char_matrix[,2])
  DF$loc <- as.factor(char_matrix[,3])
  DF$cha <- as.factor(char_matrix[,4])
  DF$qual <- as.factor(char_matrix[,5])
  
  return(DF)
}

# Try it out
url <- "http://service.iris.edu/mustang/measurements/1/query?net=IU&sta=FUNA&loc=00&cha=BHZ&timewindow=2015-01-01,2015-03-31&output=text&metric=sample_mean&orderby=start"
metricDF <- getMetricDF(url)
head(metricDF)
##      value           target      start        end
## 1 -4740440 IU.FUNA.00.BHZ.M 2015-01-01 2015-01-02
## 2 -4740420 IU.FUNA.00.BHZ.M 2015-01-02 2015-01-03
## 3 -4586580 IU.FUNA.00.BHZ.M 2015-01-03 2015-01-04
## 4 -3061230 IU.FUNA.00.BHZ.M 2015-01-04 2015-01-05
## 5 -2856150 IU.FUNA.00.BHZ.M 2015-01-05 2015-01-06
## 6 -3617690 IU.FUNA.00.BHZ.M 2015-01-06 2015-01-07
##                       lddate net  sta loc cha qual
## 1 2015/02/14 01:29:26.044289  IU FUNA  00 BHZ    M
## 2 2015/02/13 23:56:08.871530  IU FUNA  00 BHZ    M
## 3 2015/02/14 01:33:07.168904  IU FUNA  00 BHZ    M
## 4 2015/02/14 05:01:36.007208  IU FUNA  00 BHZ    M
## 5 2015/02/14 02:17:11.509053  IU FUNA  00 BHZ    M
## 6 2015/02/14 01:12:17.870723  IU FUNA  00 BHZ    M

Several features of R’s syntax are worth noting:

function(url, skip=1)

R functions accept arguments as named parameters with optional default values as in: skip=1. If no default value is specified for parameter, a default of NULL will be used.

read.csv(url, stringsAsFactors=FALSE, skip=skip)

R’s function parsing understands name=value whether the parameter name is in quotes or not. The above line could also be written:

read.csv(url, 'stringsAsFactors'=FALSE, 'skip'=skip)

first plotting function

With our dataframe creation function in hand it’s time to write our first plotting function whose sole purpose is to wrap the existing plot() function:

# Function to create a timeseries plot
myTimeseriesPlot <- function(DF, varName1, varName2, ...) {
  
  # Extract data
  x <- DF[[varName1]]
  y <- DF[[varName2]]
  
  # Sanity check that varName1 has class 'POSIXct'
  if ( !("POSIXct" %in% class(x)) ) {
    stop(paste0('Variable ',varName1,' is not POSIXct'))
  }
  
  # Sanity check that varName2 is numeric
  if ( !is.numeric(y) ) stop(paste0('Variable ',varName2,' is not numeric.'))
  
  # Create plot
  plot(x, y, ...)
  
}

# Try it out
myTimeseriesPlot(metricDF, 'start', 'value', 
                 xlab='Date', ylab='Mean',
                 main='Daily Mean Amplitude')

Here are some features to note:

function(DF, varName1, varName2, ...)

The ... argument is there to accept any additional arguments you pass in. This is particularly useful when you are wrapping a function that people may already be familiar with. Our function simply passes the unidentified arguments on with plot(x, y, ...). (See ?plot.default for a coomplete list of possible arguments.)

x <- DF[[varName1]]

This is the correct syntax for referencing a list element or dataframe column given the name of that element or column.

stop(paste0('Variable ',varName1,' is not POSIXct'))

The stop() function stops evaluation, immediately returns from the function and throws an error message that can be trapped. for more detailed disucssion of error handling please see:

myTimeseriesPlot(metricDF, 'start', 'value', 
                 xlab='Date', ylab='Mean',
                 main='Daily Mean Amplitude')

Function calls with many arguments can be broken up over several lines to make for more readable code.


Task 1: Improve upon the plot:

  • copy and paste the function above into a new .R script in RStudio named myTimeseriesPlot.R
  • read up on ?plot.default and ?par to learn about the following graphical parameters: pch, cex, col, las, xlim, ylim, xlab, ylab, main, sub, bty
  • use these parameters inside your funciton to improve the appearance of your plot.

MATLAB styling

The Mustang Databrowser uses MATLAB styling for its timeseries plots. The rest of this lesson is a script that produces a MATLAB style timeseries plot. This script is an abridged version of the various techniques that were used to stlye these plots.

As you will see, being persnickety about how things look can cause you to write a lot of code.

Your task is the following:

  • copy and paste the code into a new .R script in RStudio
  • try running it with different metric dataframes
  • study and understand what the different functions are doing
  • make changes to the script without breaking it
  • adjust parameters so that it looks good when run interactively in RStudio

Good luck!


myMatlabTimeseries <- function(DF, varName1, varName2, snclName, metricTitle, ...) {
    
  # ----- Data -----------------------------------------------------------------
  
  # Extract data
  time <- DF[[varName1]]
  metric <- DF[[varName2]]
  
  # Sanity check that varName1 has class "POSIXct"
  if ( !("POSIXct" %in% class(time)) ) {
    stop(paste0('Variable ',varName1,' is not POSIXct'))
  }
  
  # Sanity check that varName2 is numeric
  if ( !is.numeric(metric) ) stop(paste0('Variable ',varName2,' is not numeric.'))
  
  
  # ----- Style ----------------------------------------------------------------
  
  # Save current par() settings
  oldPar <- par()
  
  # Global graphical parameter settings
  par(bg='gray95', mar=c(.5,3,.5,3), oma=c(5,2,4,0))

  # Plotting parameters
  pch <- 15; col <- 'red'; cex <- 1.0
  las <- 1
  par(cex=1.0)
  
  # x axis range from data we know exists in the metric dataframe
  xlim <- range(DF$start,DF$end)
  starttime <- xlim[1]
  endtime <- xlim[2]
  
  # ----- Plot -----------------------------------------------------------------
  
  plot(time, metric,
       panel.first=rect(par("usr")[1],par("usr")[3],par("usr")[2],par("usr")[4], col="white"),
       xlim=xlim, bty='n',
       pch=pch, col=col, cex=cex,
       axes=FALSE, xlab="", ylab="")

  # Draw a box
  box()

  # Create x axis ticks and labels
  dayCount <- as.numeric( difftime(endtime,starttime,units="days") )
  
  if (dayCount < 14) {     # < 2 weeks with ticks on days    
    xAxTicks <- seq(starttime,endtime,by="day")
    xAxLabels <- format(xAxTicks,"%d-%b",tz="GMT")
  } else if (dayCount >= 14 && dayCount < 42) {   # < 1 month with ticks every week
    xAxTicks <- seq(starttime,endtime,by="week")
    xAxLabels <- format(xAxTicks,"%d-%b",tz="GMT")
  } else if (dayCount >= 42 && dayCount < 95) {   # < 3 months with ticks every 2-weeks
    xAxTicks <- seq(starttime,endtime,by="2 weeks")
    xAxLabels <- format(xAxTicks,"%d-%b",tz="GMT")
  } else { # > 3 months with ticks on ~ month boundaries
    xAxTicks <- seq(starttime,endtime,by="month")
    xAxLabels <- format(xAxTicks,"%b-%Y",tz="GMT")    
  }  
  
  # Create y axis ticks and labels
  yAxTicks <- axTicks(2)
  if (max(yAxTicks) >= 1e5) {
    yAxLabels <- format(yAxTicks,scientific=TRUE,digits=2)
  } else {
    yAxLabels <- yAxTicks
  }
    
  # X axis and grid
  axis.POSIXct(1, at=xAxTicks, labels=xAxLabels, line=0)
  abline(v=xAxTicks, col="lightgray", lty="dotted", lwd=par("lwd"))
  
  # Y axis and grid
  axis(side=2, las=1, at=yAxTicks, labels=yAxLabels)
  abline(h=yAxTicks, col="lightgray", lty="dotted", lwd=par("lwd"))
  
  # ----- Annnotation ----------------------------------------------------------
  
  # Title at the top
  text <- paste(snclName,'--',metricTitle)
  title(text, outer=TRUE)
    
  # Dates at the bottom
  dataDateRange <- paste(format(starttime,"%b %d, %Y",tz="GMT"),"-",format(endtime,"%b %d, %Y",tz="GMT"))
  dataJulianDateRange <- paste(format(starttime,"%Y.%j",tz="GMT"),"-",format(endtime,"%Y.%j",tz="GMT"))
  dataRange <- paste("Data for",dataDateRange,'(',dataJulianDateRange,')')

  line <- par('oma')[1] - 1.5  # 1.5 lines off the outer margin bottom
  mtext(dataRange, side=1, line=line, cex=1.2)
  
  # Restore old par settings
  par(oldPar)
  
}

# Try it out
myMatlabTimeseries(metricDF, 'start', 'value', 
                   snclName='IU.FUNA.00.BHZ',
                   metricTitle='Daily Mean Amplitude')


Strings and Dates < prev | next > ‘seismic’ Packages