Using R — Standalone Scripts & Error Messages

This entry is part 8 of 21 in the series Using R

Open-source R is an amazing tool for statistical analysis and data visualization. Serious R gurus have found ways to do just about anything entirely within the R environment. Nevertheless, many of us wish to plug R into larger, multi-language frameworks where business logic is handled by another language and R is primarily responsible for analysis. This can be an excellent division of labour, but it requires that you first get a handle on R’s warnings and errors and how they are passed upstream.

Rscript

The easiest way to encapsulate blocks of functionality for use by other programs is to create Unix-style, independent executables. (There are also many packages that allow R to be called interactively from within another language, but they are beyond the scope of this post.) When writing standalone scripts for R you should use the Rscript executable that is distributed with R. You can learn more with Rscript --help or by reading the documentation. Rscript has the following advantages over command-line R:

  • It is an executable and takes a named file as an argument.
  • Additional arguments can be passed in on the command line.
  • Default settings are equivalent to the R command line arguments --slave --no-restore (--slave was renamed --no-echo in R 4.0.0).

A simple executable script would look like this:

#!/usr/bin/env Rscript
# printArgs.r -- does just what it says

arguments <- commandArgs(trailingOnly=TRUE)
# seq_along() is safe with zero arguments; 1:length(arguments) would loop over 1 and 0
for (i in seq_along(arguments)) {
  print(paste("arg",as.character(i),"=",arguments[i]))
}

Invoking the script is just like any other shell script:

$ chmod +x printArgs.r
$ ./printArgs.r a b 12.3
[1] "arg 1 = a"
[1] "arg 2 = b"
[1] "arg 3 = 12.3"
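
Note that every argument arrives as a character string, so the 12.3 above is text rather than a number. A minimal sketch of converting and validating a numeric argument might look like the following. The helper parse_number() is hypothetical, introduced only for illustration; it is not part of R or of the script above.

```r
# parse_number() is a hypothetical helper, for illustration only.
# It converts one command-line argument to a number, or stops with a clear message.
parse_number <- function(arg) {
  value <- suppressWarnings(as.numeric(arg))  # "abc" would otherwise warn and yield NA
  if (is.na(value)) {
    stop("not a number: ", arg, call. = FALSE)
  }
  value
}

print(parse_number("12.3") * 2)
```

In a real script you would apply something like this to each element of commandArgs(trailingOnly=TRUE) that is expected to be numeric.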

If you are familiar with R programming, this is really all you need to get started, with one caveat: what happens when things go wrong? How can a calling program trap errors and respond appropriately? The rest of this post addresses the text strings that are generated for warnings and errors and how they can be modified and redirected. True error handling within R with the tryCatch() function is idiosyncratic enough to warrant a separate post.

Internal settings for warnings and errors

While R has no command line arguments that affect warnings and errors, there are several internal ‘options’ that control the generation of warning messages. You can learn about all of them by typing ?options at the R prompt. Those options related to warnings and errors include:

‘check.bounds’: logical, defaulting to ‘FALSE’.  If true, a
     warning is produced whenever a vector (atomic or ‘list’) is
     extended, by something like ‘x <- 1:3; x[5] <- 6’.

‘error’: either a function or an expression governing the handling
     of non-catastrophic errors such as those generated by ‘stop’
     as well as by signals and internally detected errors.  If the
     option is a function, a call to that function, with no
     arguments, is generated as the expression.  The default value
     is ‘NULL’: see ‘stop’ for the behaviour in that case.  The
     functions ‘dump.frames’ and ‘recover’ provide alternatives
     that allow post-mortem debugging.  Note that these need to
     specified as e.g.  ‘options(error=utils::recover)’ in startup
     files such as ‘.Rprofile’.

‘showWarnCalls’, ‘showErrorCalls’: a logical.  Should warning and
     error messages show a summary of the call stack?  By default
     error calls are shown in non-interactive sessions.

‘show.error.messages’: a logical.  Should error messages be
     printed?  Intended for use with ‘try’ or a user-installed
     error handler.

‘warn’: sets the handling of warning messages.  If ‘warn’ is
     negative all warnings are ignored.  If ‘warn’ is zero (the
     default) warnings are stored until the top-level function
     returns.  If fewer than 10 warnings were signalled they will
     be printed otherwise a message saying how many (max 50) were
     signalled.  An object called ‘last.warning’ is created and
     can be printed through the function ‘warnings’.  If ‘warn’ is
     one, warnings are printed as they occur.  If ‘warn’ is two or
     larger all warnings are turned into errors.

‘warnPartialMatchArgs’: logical.  If true, warns if partial
     matching is used in argument matching.

‘warnPartialMatchAttr’: logical.  If true, warns if partial
     matching is used in extracting attributes via ‘attr’.

‘warnPartialMatchDollar’: logical.  If true, warns if partial
     matching is used for extraction by ‘$’.

‘warning.expression’: an R code expression to be called if a
     warning is generated, replacing the standard message.  If
     non-null it is called irrespective of the value of option
     ‘warn’.

‘warning.length’: sets the truncation limit for error and warning
     messages.  A non-negative integer, with allowed values
     100...8170, default 1000.
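
As one illustration of the ‘error’ option, a script can install its own handler that exits with a status code of its choosing, which a calling program can then inspect. The sketch below is only one way you might wire this up. The status value 3 is arbitrary, and the child script is written to a temporary file purely to keep the example self-contained.

```r
# Child script: installs an error handler, then fails on purpose.
# The exit status 3 is an arbitrary choice for this illustration.
child <- tempfile(fileext = ".r")
writeLines(c(
  "options(error = function() quit(save = 'no', status = 3))",
  "stop('boom')"
), child)

# Run it with the Rscript that belongs to the current R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# With stdout/stderr discarded, system2() returns the child's exit status.
status <- system2(rscript, shQuote(child), stdout = FALSE, stderr = FALSE)
print(status)
```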

The most useful of these is ‘warn’, which can be used to make R shut up about minor stuff. Bear in mind that a negative value silences every warning, not just the minor ones:

> options(warn = -1)
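
Because options() returns the previous values, the setting can be changed for a few lines and then restored. The following sketch, which assumes a deliberately bad conversion as the source of the warning, compares that approach with suppressWarnings() and with warn = 2, where the warning becomes an error.

```r
# options() returns the old values, so the previous setting can be restored.
old <- options(warn = -1)
x <- as.integer("abc")   # would normally warn about NAs introduced by coercion
options(old)

# suppressWarnings() limits the silence to a single expression.
y <- suppressWarnings(as.integer("abc"))

# With warn = 2 the same warning is turned into an error, trapped here with try().
old <- options(warn = 2)
res <- try(as.integer("abc"), silent = TRUE)
options(old)
print(inherits(res, "try-error"))
```

Silencing a single expression is usually the safer choice in a standalone script, since a global warn = -1 also hides warnings you did not anticipate.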

Redirecting output

To begin at the beginning we have to know where warning and error messages go when R is first started up. In An Introduction to R we read:

Warning and error messages are sent to the error channel (stderr).

Just as R has a source() function that causes R to accept input from a connection (a file or URL), it has a complementary sink() function that causes R to redirect output to a connection (file, stderr, or stdout). You can read up with ?source and ?sink but sometimes the best learning happens through experimentation.
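
As a first small experiment, here is a sketch of the basic sink() pattern: divert normal output to a file, then restore it. The temporary file is an assumption made only so the example cleans up after itself.

```r
# Divert normal output to a temporary file.
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "wt")
sink(con, type = "output")

print("captured")
cat("also captured\n")

# Calling sink() with no connection removes the most recent diversion.
sink(type = "output")
close(con)

# Read back what was written while the sink was active.
captured <- readLines(tmp)
print(captured)
```

Diversions of normal output form a stack, so each sink() that installs one should be matched by a sink() that removes it.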

Here is a script which experiments with options(warn) and the sink() function.

#!/usr/bin/env Rscript
# redirect.r -- experiments with warning and error messages

# Get any arguments (and ignore them)
arguments <- commandArgs(trailingOnly=TRUE)

# Default
write("1) write() to stderr", stderr())
write("1) write() to stdout", stdout())
warning("1) warning()")

# Ignore all warnings
options(warn = -1)
write("2) write() to stderr", stderr())
write("2) write() to stdout", stdout())
warning("2) warning()")

# Send all STDERR to STDOUT using sink()
options(warn = 0) # default setting
sink(stdout(), type="message")
write("3) write() to stderr", stderr())
write("3) write() to stdout", stdout())
warning("3) warning()")

# Send all STDOUT to STDERR using sink()
sink(NULL, type="message") # default setting
sink(stderr(), type="output")
write("4) write() to stderr", stderr())
write("4) write() to stdout", stdout())
warning("4) warning()")

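# Send messages and output to separate files
msg <- file("message.Rout", open="wt")
out <- file("output.Rout", open="wt")
sink(msg, type="message")
sink(out, type="output")
write("5) write() to stderr", stderr())
write("5) write() to stdout", stdout())
warning("5) warning()")

# Restore the defaults and close the files
sink(type="message")
sink(type="output") # undoes the sink to out
sink(type="output") # undoes the earlier sink to stderr
close(msg)
close(out)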

Copy and paste this script, make it executable, and then run it with shell-level redirects to observe the behaviour of the sink() function:

./redirect.r         # your terminal will display all output
./redirect.r > out   # your terminal will display only stderr
./redirect.r 2> err  # your terminal will display only stdout

When working together with other programmers in a larger framework, we are sometimes asked to direct certain categories of output to either stderr or stdout for processing by whatever code is invoking our script. Hopefully this example is enough to get you started writing some small, robust, Unix-style lego-bricks of functionality that will play nice in a larger framework.
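
The caller will usually be written in some other language, but the idea can be sketched in R itself with system2(), which can keep the two streams apart and report the exit status. Everything about the child script below is hypothetical; it exists only to produce output on both channels and then fail.

```r
# Hypothetical child script: writes to both streams, then fails.
child <- tempfile(fileext = ".r")
writeLines(c(
  "write('result: 42', stdout())",
  "write('something looks off', stderr())",
  "stop('giving up')"
), child)

# Use the Rscript that belongs to the current R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Send each stream to its own file; system2() then returns the exit status.
out_file <- tempfile()
err_file <- tempfile()
status <- system2(rscript, shQuote(child), stdout = out_file, stderr = err_file)

out <- readLines(out_file)
err <- readLines(err_file)

print(status)  # non-zero because the child called stop()
print(out)     # only what the child wrote to stdout
print(err)     # the stderr line plus R's own error message
```

A caller in any other language can apply the same three checks: the exit status, the contents of stdout, and the contents of stderr.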

Postscript

If your Rscript is going to source files that contain S4 class definitions you will run into the following error:

Error: could not find function "setClass"
Execution halted

This is because Rscript in versions of R before 3.5.0 does not, by default, load the “methods” package (more recent versions do). Check out the “attached base packages” in each of the following:

R --vanilla -e "sessionInfo()"
...
Rscript --vanilla -e "sessionInfo()"
...

To solve this problem you only need to load the methods package at the beginning of your script with:

require(methods)
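
For instance, a script that defines an S4 class could start as in the sketch below. The class name Point and its slots are hypothetical, chosen only to illustrate the point.

```r
# Attach methods explicitly so that setClass() is available under any Rscript.
library(methods)

# "Point" is a hypothetical class used only for this illustration.
setClass("Point", representation(x = "numeric", y = "numeric"))

p <- new("Point", x = 1, y = 2)
print(is(p, "Point"))
print(p@x + p@y)
```

Attaching the package explicitly is harmless on versions of R where it is already loaded.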

 
