In lesson 09 we will go through the steps required to create an entirely new SingleValueMetric
including various sanity checks and error reporting required in a robust metric algorithm.
R has many packages (6,000+) that are often written by experts in some field of science statistics. By utilizing these packages we get to benefit from the hard work of others. The R website hosts a set of CRAN Task Views, each maintained by a recognized expert, that describe the different packages available in different fields of inquiry. For basic numerical methods, the Task View on Numerical Mathematics describes packages that deal with Linear Algebra, Special Functions, Polynomials, Differentiation and Integration, etc. There are even packages with matlab functions that have been ported to R. The IRISSeismic package already utilizes functions from the pracma and signal packages mentioned in this Task View.
There is also a Task View for Time Series Analysis that may have useful packages for analyzing seismic signals.
Note 1: Many of the packages in Time Series Analysis are not tailored to the needs of high-resolution signal processing. Instead, they often focus on statistically robust analysis of timeseries data with many fewer data points.
Note 2: The RSEIS package by Jonathan Lees defines a suite of functionality for working with seismic data but also creates an entire interactive environment. While this package may contain source code defining useful functions, the package functions are too tightly bound to the RSEIS internal data structure to be useful in the context of the seismic~ packages.
To create a new metric we first need to come up with something we wish to measure. One of they many ways in which a seismic channel can go bad is when the generated signal contains mostly white noise. It would be useful to spot check seismic data for white noise, perhaps multiple times per day and come up with a metric that measures the likelihood that white noise occcurred on that day.
Doing a simple search for “white noise” on the Time Series Analysis Task View we see:
Tests of white noise using wavelets are provided by hwwntest.
Quickly perusing the package manual we see that this package references a paper published in December 2014 in the journal Stat: White noise testing using wavelets. From the abstract:
Testing whether a time series is consistent with white noise is an important task within time series analysis and for model fitting and criticism via residual diagnostics. We introduce three fast and efficient white noise tests that assess spectral constancy via the wavelet coefficients of a periodogram. The Haar wavelet white noise test derives the exact distribution of the Haar wavelet coefficients of the asymptotic periodogram under mild conditions. The single-coefficient white noise test uses a single Haar wavelet coefficient obtaining a test statistic as a linear combination of odd-indexed autocorrelations. The general wavelet white noise test uses compactly supported Daubechies wavelets, shows that its coefficients are asymptotically normal and derives its theoretical power for an arbitrary spectrum. All our tests are available in the freely available hwwntest package for the R system. We present a comprehensive simulation study that shows the good performance of our new tests against alternatives commonly found in available software and show an example applied to a wind power time series. © 2014 The Authors. Stat published by John Wiley & Sons Ltd.
Sounds promising!
After installing the hwwntest package with RStudio we can review available functions in the ‘Help’ tab.
We will short circuit the learning process and suggest that we use the genwwn.test()
function. Here is a quick example to show what it does.
library(hwwntest)
# Create two time series, one white noise and one generated by an ARIMA model
# NOTE: time series must have a length that is a power of 2
white <- rnorm(1024)
model <- arima.sim(n=1024, model=list(ar=0.8))
mixed <- white + 0.2*model
# Try out the test
genwwn.test(white)
##
## Wavelet Test of White Noise
##
## data:
## p-value = 0.688
genwwn.test(model)
##
## Wavelet Test of White Noise
##
## data:
## p-value < 2.2e-16
genwwn.test(mixed)
##
## Wavelet Test of White Noise
##
## data:
## p-value = 0.1009
# What does genwwn.test return?
whiteTest <- genwwn.test(white)
names(whiteTest)
## [1] "p.val.collector" "p.val.adjust" "p.value" "method"
# Capture the p-values
whitePValue <- genwwn.test(white)$p.value
modelPValue <- genwwn.test(model)$p.value
mixedPValue <- genwwn.test(mixed)$p.value
# Set up a 'stacked' plot
layout(matrix(seq(3)))
par(mar=c(3,4,4,2)+0.1)
# Plot data
plot(white,type='p',las=1)
title(paste0('White Noise, p-value=',sprintf("%g",whitePValue)),line=2.5)
mtext('NEAR CERTAINTY this signal contains white noise',side=3,line=0.5,col='black')
plot(model,type='p',las=1)
title(paste0('Model, p-value=',sprintf("%g",modelPValue)),line=2.5)
mtext('NEAR CERTAINTY this signal does NOT contain white noise',side=3,line=.5,col='black')
plot(mixed,type='p',las=1)
title(paste0('White + 0.2*Model, p-value=',sprintf("%g",mixedPValue)),line=2.5)
mtext('GOOD CHANCE this contains white noise',side=3,line=.5,col='black')
# Restore single plot layout
par(mar=c(5,4,4,2)+0.1)
layout(1)
Before we start coding, we need to design our metric function. All metric functions that define a SingleValueMetric
should accept a Stream
object as their first argument. Typically, the function will be called with an entire day’s worth of data for a particular SNCL. After the Stream
object, additional arguments may be passed in that allow us to modify the calculation of our metric.
Here is some pseudo-code that will describe our basic plan:
.bincode()
SingleValueMetric
Task 1: Create white noise metric
(st, sampleLength=1024, incrementSecs=3600, breaks=c(0,1e-12,1e-10,1e-8,1e-6,1e-5,1e-4,1e-3,1e-2,1e-1,1))
seismicMetrics::DCOffsetTimeseriesMetric
and seismicMetrics::basicStatsMetric
as starting points for creating your functionst <- getSNCL(iris,"IU.GRFO..BHZ", as.POSIXct("2011-05-18",tz="GMT"), as.POSIXct("2011-05-19",tz="GMT"))
to test your functionTask 2: Examine p-value codes
Task 3: Collaborative Improvement of the white noise metric
ggplot2 < prev | next > Modifying ‘seismic’ Packages