<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://organizeseries.com/"
	>

<channel>
	<title>Working With Data</title>
	<atom:link href="http://mazamascience.com/WorkingWithData/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://mazamascience.com/WorkingWithData</link>
	<description>Manning the helm during the data deluge.</description>
	<lastBuildDate>Fri, 17 May 2013 17:00:18 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Using R &#8212; Working with Geospatial Data</title>
		<link>http://mazamascience.com/WorkingWithData/?p=1277</link>
		<comments>http://mazamascience.com/WorkingWithData/?p=1277#comments</comments>
		<pubDate>Sun, 14 Apr 2013 17:28:09 +0000</pubDate>
		<dc:creator>Steven Brey</dc:creator>
				<category><![CDATA[Data Visualization]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[DNR]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[spatial]]></category>
		<category><![CDATA[Washington]]></category>

		<guid isPermaLink="false">http://mazamascience.com/WorkingWithData/?p=1277</guid>
		<description><![CDATA[This entry is part 6 of 12 in the series Using RGIS, an acronym that brings joy to some and strikes fear in the heart of those not interested in buying expensive software. Luckily fight or flight can be saved for another day because you &#8230; <a href="http://mazamascience.com/WorkingWithData/?p=1277">&#160;&#160;read more ... </a>]]></description>
				<content:encoded><![CDATA[<div class="seriesmeta">This entry is part 6 of 12 in the series <a href="http://mazamascience.com/WorkingWithData/?series=using-r" class="series-54" title="Using R">Using R</a></div><p>GIS, an acronym that brings joy to some and strikes fear in the heart of those not interested in buying expensive software. Luckily fight or flight can be saved for another day because you don&#8217;t need to be a GIS jock with a wad of cash to work with spatial data and make beautiful plots. A computer and internet connection should be all you need. In this post I will show how to</p>
<ul>
<li>Get a machine ready to use R to work with geospatial data</li>
<li>Describe what type of data can be used and some of the exciting sources of free GIS data</li>
<li>Use data from the Washington Department of Natural Resources to generate some pretty plots by working through an example script<span id="more-1277"></span></li>
</ul>
<h3>Getting your Machine Ready</h3>
<p>First, if you do not have R on your computer you can download it <a title="here" href="http://www.r-project.org/" target="_blank">here</a>. Even better, download Rstudio, an incredibly efficient and easy to use interface for working with R available <a title="here" href="http://www.rstudio.com/" target="_blank">here</a>.  Working with R studio is highly recommended and will be more clearly outlined in this post. R does not support working with spatial data straight out of the box so there are a couple of packages that need to be downloaded to get R working with spatial data. The two packages required are &#8216;sp&#8217; and &#8216;rgdal&#8217;.  We will also use a third package, &#8216;rgeos&#8217; for some fancy geospatial tricks. Unfortunately, the latest release of the sp package is not compatible with the latest version of R &#8212; v 3.0 at this time. When the Rstudio install prompts to install R, download version 2.15.3.</p>
<p>First intall and open RStudio. To add the required packages open RStudio and click on the &#8220;packages&#8221; tab in the lower right hand panel of the interface. The lower right window will show a list of packages that come with a standard download of RStudio. Some of the packages will have check marks next to them, this means that those libraries are loaded and ready to be used. If you just downloaded R for the first time sp and rdgal will not be on that list, click on the &#8220;Install Packages&#8221; button. Make sure the &#8220;Install from&#8221; option is set to &#8220;Repository (CRAN)&#8221; and type &#8220;sp&#8221; into  the &#8220;Packages&#8221; space. Check the &#8220;Install Dependencies&#8221; option and download! By checking the &#8220;Install Dependencies&#8221; option packages sp needs to function properly will automatically be downloaded. Download rgdal in the same way and you have the tools needed to start!  Download rgeos as well and you can run the portion of the example script that uses centroids.</p>
<h3>Sp and Rgdal abilities and sources of Data</h3>
<p>Rgdal is what allows R to understand the structure of shapefiles by providing functions to read and convert spatial data into easy-to-work-with R dataframes. Sp enables transformations and projections of the data and provides functions for working with the loaded spatial polygon objects. The take away should be that these are powerful libraries that allow R to use .shp files. The <a title="USGS" href="http://nationalmap.gov/" target="_blank">US Geological Survey</a>, the <a href="http://www.nps.gov/gis/data_info/" target="_blank">National Park Serivce,</a> and the Washington State <a href="http://fortress.wa.gov/dnr/app1/dataweb/dmmatrix.html" target="_blank">Department of Natural Resources</a> are a just a few examples of organizations that make enormous stockpiles of spatial data available to the public.</p>
<h3>Example Script</h3>
<p>The following code uses Watershed Resource Inventory Area (WRIA) spatial data from the Washington State Department of Ecology.  This dataset contains information on Washington State&#8217;s different water resource management areas.  Exactly what information is stored in the shapefiles will be explored using R! If a function or any of the code looks mysterious try  &#8221;? mysteriousFunctionName()&#8221; and handy documentation will fill you in on what a function does. Lets start using R to investigate the data.  Just cut and paste the following code into RStudio.</p><pre class="crayon-plain-tag"># WRIA example by Steven Brey @ mazamascience.com

# load the required libraries 
library(sp) 
library(rgdal)

# First load the data from the Washington department of ecology website

# Data source
#
# Data:      ftp://www.ecy.wa.gov/gis_a/hydro/wria.zip
# Metadata:  http://www.ecy.wa.gov/services/gis/data/hydro/wria.htm

# Create a directory for the data
localDir &lt;- 'R_GIS_data'
if (!file.exists(localDir)) {
  dir.create(localDir)
}
# Download the 5mb file of WRIA data
url &lt;- 'ftp://www.ecy.wa.gov/gis_a/hydro/wria.zip'
file &lt;- paste(localDir,basename(url),sep='/')
if (!file.exists(file)) {
  download.file(url, file)
  unzip(file,exdir=localDir)
}
# Show the unzipped files 
list.files(localDir)

# layerName is the name of the unzipped shapefile without file type extensions 
layerName &lt;- "WRIA_poly"  
# Read in the data
data_projected &lt;- readOGR(dsn=localDir, layer=layerName) 

# What is this thing and what's in it?
class(data_projected)
slotNames(data_projected)
# It's an S4 "SpatialPolygonsDataFrame" object with the following slots:
# [1] "data"        "polygons"    "plotOrder"   "bbox"        "proj4string"

# What does the data look like with the default plotting command? 
plot(data_projected)</pre><p><a href="http://mazamascience.com/WorkingWithData/wp-content/uploads/2013/04/wria_default_plot.png"><img class="aligncenter size-full wp-image-1396" alt="wria_default_plot" src="http://mazamascience.com/WorkingWithData/wp-content/uploads/2013/04/wria_default_plot.png" width="550" height="470" /></a></p>
<p>We don&#8217;t know yet what the different &#8216;slots&#8217; contain but Holy Smokes!  That plot looks like Washington. The plot is not pretty (yet) but in just a few lines of code you can already make a primitive map of the polygons in this dataset. The fact that this object is a <em>~DataFrame</em> suggests that we can treat it like an R dataframe.  Lets find out what data it has, select the interesting columns and work with only those.  (Not surprisingly, the @data slot is where the actual dataframe exists but many dataframe methods also work on the entire object.)</p>
<p>We&#8217;ll go to the trouble of renaming variables, reprojecting the data to &#8220;longlat&#8221; and then saving the data to a .RData file for easy loading in the future.</p><pre class="crayon-plain-tag"># Could use names(data_projected@data) or just:
names(data_projected)
#["WRIA_ID"    "WRIA_NR"    "WRIA_AREA_" "WRIA_NM"    "Shape_Leng" "Shape_Area"

# Identify the attributes to keep and associate new names with them
attributes &lt;- c("WRIA_NR", "WRIA_AREA_", "WRIA_NM")
# user friendly names 
newNames &lt;- c( "number", "area", "name")

# Subset the full dataset extracting only the desired attributes
data_projected_subset &lt;- data_projected[,attributes]

# Assign the new attribute names
names(data_projected_subset) &lt;- newNames

# Create a dataframe name (potentially different from layerName)
data_name &lt;- "WRIA"

# Reproject the data onto a "longlat" projection and assign it to the new name
assign(data_name,spTransform(data_projected_subset, CRS("+proj=longlat")))

# NOTE: If using assign() above gave you an error it is likely the version of 
# NOTE: R you are using does not currently support the sp package. Use R 
# NOTE: 2.15.3 instead. 

# The WRIA dataset is now projected in latitude longitude coordinates as a
# SpatialPolygonsDataFrame.  We save the converted data as .RData for faster
# loading in the future.
save(list=c(data_name),file=paste(localDir,"WAWRIAs.RData",sep="/"))

# Upon inspecting the metadata you can see that the first 19 areas in WRIA
# surround Puget Sound. The names of the first 19 watersheds in WRIA are
WRIA$name[1:19]

# For fun, save a subset including only only these 19 areas
PSWRIANumbers &lt;- c(1:19)
WRIAPugetSound &lt;- WRIA[WRIA$number %in% PSWRIANumbers,]
# Sanity check, plot WRIAPugetSound to make sure it looks like the subset we want
plot(WRIAPugetSound)

# Save Puget Sound data as an RData file which we can quickly "load()"
data_name &lt;- "WRIAPugetSound"
save(list=c(data_name),file=paste(localDir,"WRIAPugetSound.RData",sep="/"))</pre><p><a href="http://mazamascience.com/WorkingWithData/wp-content/uploads/2013/04/puget_sound_default.png"><img class="aligncenter size-full wp-image-1394" alt="puget_sound_default" src="http://mazamascience.com/WorkingWithData/wp-content/uploads/2013/04/puget_sound_default.png" width="550" height="470" /></a></p>
<p>We have now saved both WRIA and WRIAPugetSound data as R spatial objects that can easily be loaded! Now the real fun begins, those of you who have been waiting for pretty plots are about to be rewarded. The rest of the script is a walk through of some of the fun analysis and basic figure creating that can easily be done with the converted WRIA data.</p><pre class="crayon-plain-tag"># Load the data that was converted to SpatialPolygonsDataFrame
# NOTE: This can be skipped (but does not have to be) if the spatial
# NOTE: objects are still in your workspace.

# Load and show the names of the attributes in WRIA
file &lt;- paste(localDir,"WAWRIAs.RData",sep="/")
load(file) 
names(WRIA)

file &lt;- paste(localDir,"WAWRIAs.RData",sep="/")
load(file)
names(WRIAPugetSound)

# Sweet.  We can see that this is the WRIA dataset we saved earlier

# NOTE: For more advanced users, slotNames(WRIA) will list the structures 
# in WRIA. Using the @ command allows you to grab a particular slot from the
# spatial object.  If you really want the HUGE AND GORY details of what's
# in this object, you can examine the full structure with str(WRIA).

# Here is how you would extract the contents of slots for easier use.
WriaData &lt;- WRIA@data
WriaBbox &lt;- WRIA@bbox

# We have a pretty good idea of what kind of data we are working with 
# and what it looks like. Now its time for the data to answer some
# questions and tell us a story.

# What is the biggest water resource area in Washington? 
maxArea &lt;- max(WRIA$area)
# Create a 'mask' identifying the biggest area so we can find out its name
# NOTE:  Eaach 'mask' we create is a vector of TRUE/FALSE values that we
# NOTE:  will use to subset the dataframe.
biggestAreaMask &lt;- which(WRIA$area == maxArea)
biggestAreaName &lt;- WRIA$name[biggestAreaMask]
biggestAreaName

# Create a SpatialPolygonsDataFrame subset
biggestArea &lt;- WRIA[biggestAreaMask,]

# plot biggest area in blue with a title
plot(biggestArea, col="blue")
title(biggestAreaName) 

# NOTE: Many more plot arguments can be explored by investigating 
# NOTE: the "SpatialPolygons" "plot-method" in the sp package</pre><p><a href="http://mazamascience.com/WorkingWithData/wp-content/uploads/2013/04/lower_yakima.png"><img class="aligncenter size-full wp-image-1393" alt="lower_yakima" src="http://mazamascience.com/WorkingWithData/wp-content/uploads/2013/04/lower_yakima.png" width="550" height="470" /></a></p><pre class="crayon-plain-tag"># I have heard of a water resource management area in Washington State
# called Pend Oreille.  Where is it located in this dataframe?
which(WriaData$name == "Pend Oreille")

# Now we have isolated the watershed with the largest area as well as the
# fabled Pend Oreille.  Lets figure out how to highlight these regions when
# plotting all  regions. I have also heard that Lake Chelan is Beautiful.
# Lets isolate it as well.

# Each of the following makes a spatialPolygonsDataFrame subset, selecting 
# a specific region based on some selected attribute in WRIA.

WRIA_puget &lt;- WRIA[WRIA$number %in% PSWRIANumbers,]
WRIA_chelan &lt;- WRIA[WRIA$name == "Chelan",]
WRIA_Pend_Oreille &lt;- WRIA[WRIA$name == "Pend Oreille",]

# Check out what they look like plotted individually 
plot(WRIA_puget)
plot(WRIA_chelan)
plot(WRIA_Pend_Oreille)

# For fun we will make 8 different watersheds 8 different colors!
watersheds &lt;- c(1:8)
watershed_colors &lt;- c("burlywood1","forestgreen","burlywood3","darkolivegreen3",
                      "cadetblue4","sienna3","cadetblue3","darkkhaki")

watershed_color_variation &lt;- WRIA[WRIA$number %in% watersheds,]

# Plot some of the created spatial objects together
plot(WRIA)
plot(WRIA_puget,add=TRUE,col="navy")
plot(WRIA_chelan,add=TRUE,col="red2")
plot(watershed_color_variation, add=TRUE, col=watershed_colors)
plot(WRIA_Pend_Oreille,add=TRUE,col="red4")

# NOTE:  gCentroid is from the 'rgeos' package
library(rgeos)
# Find the center of each region and label lat and lon of centers
centroids &lt;- gCentroid(WRIA, byid=TRUE)
centroidLons &lt;- coordinates(centroids)[,1]
centroidLats &lt;- coordinates(centroids)[,2]

# Find regions with center East of -121 longitude (East of Cascade mountains) 
eastSideMask &lt;- centroidLons &gt; -121

# Create spatialPolygonsDataFrame for these regions
WRIA_nonPacific &lt;- WRIA[eastSideMask,]

# Find watersheds with area 75th percentile or greater
basinAreaThirdQuartile &lt;- quantile(WRIA$area,0.75)
largestBasinsMask &lt;- WRIA$area &gt;= basinAreaThirdQuartile

WRIA_largest &lt;- WRIA[largestBasinsMask,]

# To get legend and labels to fit on the figure we can change the size of the
# area plotting. bbox(WRIA) shows the bounding lat and long of WRIA.
bBox &lt;- bbox(WRIA)
ynudge &lt;- 0.5
xnudge &lt;- 3
xlim &lt;- c(bBox[1,1] + xnudge , bBox[1,2] - xnudge)
ylim &lt;- c(bBox[2,1] - ynudge, bBox[2,2] )

# Plot some of the different spatial objects and show lat-long axis, 
# label watersheds
plot(WRIA,axes=TRUE,xlim=xlim,ylim=ylim)
plot(WRIA_nonPacific,add=TRUE,col="red") 
plot(WRIA_puget,add=TRUE,col="navy")
plot(WRIA_largest,add=TRUE,density=20,col="green1")
text(centroidLons, centroidLats, labels=WRIA$number, col="blue", cex=.7)

title(main="Washington Watersheds")
labels &lt;- c("Puget Sound Watersheds", "Washington's biggest watersheds",
            "drain to Pacific via Columbia river") 
label_cols &lt;- c("navy","green1","red")
legend("bottomleft", NULL, labels, fill=label_cols, density, bty="n", cex=.8)</pre><p><img alt="" src="http://mazamascience.com/MazamaScience/wp-content/uploads/2013/04/wa_basic_legend.png" width="0" height="0" /><img alt="" src="http://mazamascience.com/MazamaScience/wp-content/uploads/2013/04/wa_basic_legend.png" width="0" height="0" /></p>
<p><a href="http://mazamascience.com/WorkingWithData/wp-content/uploads/2013/04/wa_basic_legend.png"><img class="aligncenter size-full wp-image-1395" alt="wa_basic_legend" src="http://mazamascience.com/WorkingWithData/wp-content/uploads/2013/04/wa_basic_legend.png" width="550" height="470" /></a></p>
<p>Who needs expensive GIS software?</p>
<p>Another useful package for plotting spatial data is the &#8220;maptools&#8221; library available from from the Cran repository. The code below requires the maptools package so make sure it is installed before running the code.</p><pre class="crayon-plain-tag"># Lets get a taste of what some of the built in plotting functions in the map
# tools package can do

# First load
library("maptools")

allWashingtonAreas &lt;- spplot(WRIA, zcol="area", identify=TRUE)

# by saving as object more than one plot can be shown on a panel
pugetSoundAreas &lt;- spplot(WRIA_puget, zcol="area", identify=TRUE)

# Open new plot frame
frame()
print(allWashingtonAreas, position=c(0,.5,.5,1), more=T)
print(pugetSoundAreas, position=c(.5,.5,1,1), more=T)</pre><p>These examples only scratch the surface of the plotting and analysis potential available in R using the sp, rgdal, and maptools packages.  Exploring the plot-methods in sp is highly recommended before investing a large amount of time into scripting a plot on your own.</p>
<p>Congratulations! You now know how to find out what is in geospatial data and how to make plots using R.</p>
<p>Happy Mapping!</p>
]]></content:encoded>
			<wfw:commentRss>http://mazamascience.com/WorkingWithData/?feed=rss2&#038;p=1277</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<series:name><![CDATA[Using R]]></series:name>
	</item>
		<item>
		<title>Best Best Practices Ever!</title>
		<link>http://mazamascience.com/WorkingWithData/?p=1218</link>
		<comments>http://mazamascience.com/WorkingWithData/?p=1218#comments</comments>
		<pubDate>Wed, 13 Feb 2013 17:35:34 +0000</pubDate>
		<dc:creator>Jonathan Callahan</dc:creator>
				<category><![CDATA[Toolbox]]></category>
		<category><![CDATA[best practices]]></category>

		<guid isPermaLink="false">http://mazamascience.com/WorkingWithData/?p=1218</guid>
		<description><![CDATA[Every once in a while I read something that is so insightful, so clearly written and so well documented that it enters my own personal pantheon of &#8220;Best Ever&#8221; documents. I recently added a new, simply divine article titled Best Practices &#8230; <a href="http://mazamascience.com/WorkingWithData/?p=1218">&#160;&#160;read more ... </a>]]></description>
				<content:encoded><![CDATA[<p>Every once in a while I read something that is so insightful, so clearly written and so well documented that it enters my own personal pantheon of &#8220;Best Ever&#8221; documents. I recently added a new, simply divine article titled <a href="http://arxiv.org/pdf/1210.0530v3.pdf">Best Practices for Scientific Computing</a> and hope that everyone reading this post also takes the time to read that article. I&#8217;m including the outline here only to encourage you to read the article in it&#8217;s entirety.  It is extremely well written.<span id="more-1218"></span></p>
<ol>
<li><span class="Apple-style-span" style="line-height: 16px;">Write programs for people, not computers.</span>
<ol>
<li><em>a program should not require its readers to hold more than a handful of facts in memory at once</em></li>
<li><em>names should be consistent, distinctive and meaningful</em></li>
<li><em>code style and formatting should be consistent</em></li>
<li><em>all aspects of software development should be broken down into tasks roughly an hour long</em></li>
</ol>
</li>
<li>Automate repetitive tasks.
<ol>
<li><em>rely on the computer to repeat tasks</em></li>
<li><em>save recent commands in a file for re-use</em></li>
<li><em>use a build tool to automate scientific workflows</em></li>
</ol>
</li>
<li>Use the computer to record history.
<ol>
<li><em>software tools should be  used to track computational work automatically</em></li>
</ol>
</li>
<li>Make incremental changes.
<ol>
<li><em>work in small steps with frequent feedback and course correction</em></li>
</ol>
</li>
<li>Use version control.
<ol>
<li><em>use a version control system</em></li>
<li><em>everything that has been created manually should be put in version control</em></li>
</ol>
</li>
<li>Don&#8217;t repeat yourself (or others).
<ol>
<li><em>every piece of data must have a single authoritative representation in the system</em></li>
<li><em>code should be modularized rather than copied and pasted</em></li>
<li><em>re-use code instead of rewriting it</em></li>
</ol>
</li>
<li>Plan for mistakes.
<ol>
<li><em>add assertions to programs to check their operation</em></li>
<li><em>use an off-the-shelf unit testing library</em></li>
<li><em>use all available oracles when testing programs</em></li>
<li><em>turn bugs into test cases</em></li>
<li><em>use a symbolic debugger</em></li>
</ol>
</li>
<li>Optimize software only after it works correctly.
<ol>
<li><em>use a profiler to identify bottlenecks</em></li>
<li><em>write code in the highest-level language possible</em></li>
</ol>
</li>
<li>Document design and purpose, not mechanics.
<ol>
<li><em>document interfaces and reasons, not implementations</em></li>
<li><em>refactor code instead of explaining how it works</em></li>
<li><em>embed the documentation for a piece of software in that software</em></li>
</ol>
</li>
<li>Collaborate.
<ol>
<li><em>use pre-merge code reviews</em></li>
<li><em>use pair programming when bringing someone new up to speed and when tackling particularly tricky problems</em></li>
</ol>
</li>
</ol>
<p>The only extra I would have included would be:</p>
<p style="padding-left: 30px;">11. Maintain and update older code.</p>
<p>If you are still hesitant to go to the original article, go there for the 67 references to other books and articles that discuss scientific computing.  Like I said, this article is a &#8220;Best Ever&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://mazamascience.com/WorkingWithData/?feed=rss2&#038;p=1218</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using R &#8212; Package Installation Problems</title>
		<link>http://mazamascience.com/WorkingWithData/?p=1185</link>
		<comments>http://mazamascience.com/WorkingWithData/?p=1185#comments</comments>
		<pubDate>Tue, 12 Feb 2013 17:52:14 +0000</pubDate>
		<dc:creator>Jonathan Callahan</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R package]]></category>

		<guid isPermaLink="false">http://mazamascience.com/WorkingWithData/?p=1185</guid>
		<description><![CDATA[This entry is part 3 of 12 in the series Using RThe post titled Installing Packages described the basics of package installation with R.  The process is wonderfully simple when everything goes well.  But it can be maddening when it &#8230; <a href="http://mazamascience.com/WorkingWithData/?p=1185">&#160;&#160;read more ... </a>]]></description>
				<content:encoded><![CDATA[<div class="seriesmeta">This entry is part 3 of 12 in the series <a href="http://mazamascience.com/WorkingWithData/?series=using-r" class="series-54" title="Using R">Using R</a></div><p>The post titled <a title="Using R — Installing Packages" href="http://mazamascience.com/WorkingWithData/?p=728">Installing Packages</a> described the basics of package installation with R.  The process is wonderfully simple when everything goes well.  But it can be maddening when it does not.  Error messages give a hint as to what went wrong but do not necessarily tell you how to resolve the problem.  This post will collect some of the error messages we&#8217;ve encountered while installing R packages and describe the reasons for the error and the workarounds we&#8217;ve found.<span id="more-1185"></span></p>
<h2>1) Older version of R</h2>
<p style="padding-left: 30px;"><code>Warning message:<br />
In install.packages(c("sp")) : package ‘sp’ is not available</code></p>
<p>This is the message that you get when the CRAN package you&#8217;re interested in requires a more recent version of R than you have.  Remember, the default behavior of <code>install.packages()</code> is to grab the latest version of a package.</p>
<p>In this case you have to poke around in the &#8220;Old sources&#8221; link on the CRAN page for that package and use trial-and-error to find an older version of the package that will work with your version of R.  You should start by determining what version of R you have:</p><pre class="crayon-plain-tag">$ R --version
R version 2.8.1 (2008-12-22)</pre><p>This version of R was released at the end of 2008 and any version of the &#8220;sp&#8221; package released in 2008 should work. At least some of the 2009 releases should also work. Perusing the <a href="http://cran.r-project.org/src/contrib/Archive/sp/" target="_blank">sp archive</a>, we might try installing version 0.9-37, the last of the 0.9-3x series which was released in May of 2009:</p><pre class="crayon-plain-tag">$ wget http://cran.r-project.org/src/contrib/Archive/sp/sp_0.9-37.tar.gz
$ sudo CMD INSTALL sp_0.9-37.tar.gz
...
$ # Success!</pre><p></p>
<div></div>
<h2>2) Unable to execute files in /tmp directory</h2>
<p style="padding-left: 30px;"><code>ERROR: 'configure' exists but is not executable -- see the 'R Installation and Administration Manual'</code></p>
<p>By default, R uses the /tmp directory to install packages.  On security conscious machines, the /tmp directory is often marked as &#8220;noexec&#8221; in the /etc/fstab file.  This means that no file under /tmp can ever be executed.  Packages that require compilation or that have self-inflating data will fail with the error above.  One such package is <a href="http://cran.r-project.org/web/packages/RJSONIO/index.html">RJSONIO</a>.</p>
<p>The solution is to set the TMPDIR environment variable which R will use as the compilation directory. For csh shell:</p><pre class="crayon-plain-tag">$ mkdir ~/tmp
$ setenv TMPDIR ~/tmp</pre><p>And for bash:</p><pre class="crayon-plain-tag">$ mkdir ~/tmp
$ export TMPDIR=~/tmp</pre><p></p>
<h2>Other problems</h2>
<p>Please leave comments describing other package installation problems and solutions you&#8217;ve encountered to help us build a more complete listing.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://mazamascience.com/WorkingWithData/?feed=rss2&#038;p=1185</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<series:name><![CDATA[Using R]]></series:name>
	</item>
		<item>
		<title>Using R &#8212; Packaging a C library in 15 minutes</title>
		<link>http://mazamascience.com/WorkingWithData/?p=1151</link>
		<comments>http://mazamascience.com/WorkingWithData/?p=1151#comments</comments>
		<pubDate>Sat, 17 Nov 2012 03:21:58 +0000</pubDate>
		<dc:creator>Jonathan Callahan</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[C library]]></category>
		<category><![CDATA[R package]]></category>

		<guid isPermaLink="false">http://mazamascience.com/WorkingWithData/?p=1151</guid>
		<description><![CDATA[This entry is part 14 of 12 in the series Using RYes, this post condenses 50+ hours of learning into a 15 minute tutorial.  Read &#8216;em and weep.  (That is, you read while I weep.) OK.  For the last week &#8230; <a href="http://mazamascience.com/WorkingWithData/?p=1151">&#160;&#160;read more ... </a>]]></description>
				<content:encoded><![CDATA[<div class="seriesmeta">This entry is part 14 of 12 in the series <a href="http://mazamascience.com/WorkingWithData/?series=using-r" class="series-54" title="Using R">Using R</a></div><p>Yes, this post condenses 50+ hours of learning into a 15 minute tutorial.  Read &#8216;em and weep.  (That is, you read while I weep.)<span id="more-1151"></span></p>
<p title="Using R — Calling C Code ‘Hello World!’">OK.  For the last week I&#8217;ve been learning how to call C code as documented in previous posts: <a title="Using R — Calling C Code ‘Hello World!’" href="http://mazamascience.com/WorkingWithData/?p=1067"> Calling C code &#8220;Hello World!&#8221;</a>, <a title="Using R — .Call(“hello”)" href="http://mazamascience.com/WorkingWithData/?p=1099">.Call(&#8220;hello&#8221;)</a> and <a title="Using R — Callling C code with Rcpp" href="http://mazamascience.com/WorkingWithData/?p=1125">Calling C code with Rcpp</a>.  Now comes that task that puts this hard won knowledge to use &#8212; providing an R interface to a library of C code as part of a package.  You are encouraged to have <a href="http://cran.r-project.org/doc/manuals/R-exts.pdf">Writing R Extensions</a> open in another window and use search whenever you see something you want more information about.  In an effort to keep this exercise to 15 minutes we will stick to the task at hand and not do a whole lot of explaining.  It is assumed that you already have the required compilers installed.</p>
<h2 title="Using R — Calling C Code ‘Hello World!’">1) Download and test your C library</h2>
<p>We&#8217;ll be working with the &#8220;libmseed&#8221; library that reads binary seismic data.  Download the .gz file from the <a href="http://www.iris.edu/software/libraries/">IRIS DMC Softare Library</a> and then compile and test it with the example code.  (Read the README files on your own time.  Right now we have work to do.)</p><pre class="crayon-plain-tag">$ cd ~/Downloads
$ wget http://www.iris.edu/pub/programs/libmseed-2.7.tar.gz
--2012-11-16 16:18:11--&amp;nbsp; http://www.iris.edu/pub/programs/libmseed-2.7.tar.gz
Resolving www.iris.edu... 128.95.166.129
Connecting to www.iris.edu|128.95.166.129|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 166186 (162K) [application/x-gzip]
Saving to: “libmseed-2.7.tar.gz”
...
$ tar -xzf libmseed-2.7.tar.gz 
$ cd libmseed
$ make
cc -c -o fileutils.o fileutils.c
cc -c -o genutils.o genutils.c
cc -c -o gswap.o gswap.c
cc -c -o lmplatform.o lmplatform.c
cc -c -o lookup.o lookup.c
cc -c -o msrutils.o msrutils.c
cc -c -o pack.o pack.c
cc -c -o packdata.o packdata.c
cc -c -o traceutils.o traceutils.c
cc -c -o tracelist.o tracelist.c
cc -c -o parseutils.o parseutils.c
cc -c -o unpack.o unpack.c
cc -c -o unpackdata.o unpackdata.c
cc -c -o selection.o selection.c
cc -c -o logging.o logging.c
rm -f libmseed.a
ar -csq libmseed.a fileutils.o genutils.o gswap.o lmplatform.o lookup.o msrutils.o pack.o packdata.o 
traceutils.o tracelist.o parseutils.o unpack.o unpackdata.o selection.o logging.o
$ cd example
$ make
cc -I.. -c msview.c
cc -I.. -o msview msview.o -L.. -lmseed
cc -I.. -c msrepack.c
cc -I.. -o msrepack msrepack.o -L.. -lmseed
$ ./msview test.mseed
IU_COLA_00_LHZ, 000001, M, 512, 112 samples, 1 Hz, 2010,058,06:50:00.069539
...
IU_COLA_00_LHZ, 000001, M, 512, 27 samples, 1 Hz, 2010,058,07:59:33.069538
$</pre><p>[02:00 minutes]</p>
<p>Congratulations are in order.  (No, not for you!  For the smart folks who wrote libmseed!) ;-)</p>
<h2>2) Set up your package structure</h2>
<p>Yes, we&#8217;re going to do that now.  You could take incremental steps and practice writing an R callable version of example/msview.c and then try to compile it with R CMD SHLIB, then give up and write a Makefile that contains all of the R CMD SHLIB flags.  But I&#8217;m here to tell you that most of that will happen auto-magically if you set up a package first and let R worry about the details.</p>
<p>Here is the minimal package structure you need.  We&#8217;ll create a new directory named &#8220;mseed&#8221; and copy all of the &#8220;libmseed&#8221; code into it:</p><pre class="crayon-plain-tag">$ cd ~
$ mkdir mseed
$ cd mseed
$ mkdir demo
$ mkdir inst
$ mkdir inst/extdata
$ mkdir R
$ mkdir src
$ cp -r ~/Downloads/libmseed ./src</pre><p>[02:40]</p>
<p>(Curiosity might lead a person to search through your already open copy of &#8220;Writing R Extensions&#8221; to figure out what &#8220;extdata&#8221; is used for.  In the interest of time, please do that later.)</p>
<h2>3) Add Makevars to your mseed/src/ directory</h2>
<p>You need to educate R that you are going to compile code that depends upon the libmseed.a library.  The easiest way to do this is by creating a file named mseed/src/Makevars::</p><pre class="crayon-plain-tag"># See Section 1.2.1 "Using 'Makevars'" of Writing R Extensions
# cran.r-project.org/doc/manuals/R-exts.pdf

PKG_CFLAGS=
PKG_CPPFLAGS=-Ilibmseed
PKG_LIBS=-Llibmseed -lmseed

$(SHLIB): libmseed/libmseed.a

libmseed/libmseed.a:
        @(cd libmseed &amp;&amp; $(MAKE) static CC="$(CC)" CFLAGS="$(CFLAGS)")</pre><p>[03:20]</p>
<h2>4) Convert example C code to R callable routine.</h2>
<p>This step assumes that you already know everything learned in the <a title="Using R — .Call(“hello”)" href="http://mazamascience.com/WorkingWithData/?p=1099">.Call(&#8220;hello&#8221;)</a> post.  I took the code in msview.c and whittled it down as much as possible before converting it into an R callable function that spits out, once again, &#8220;Hello World!&#8221;.  Here&#8217;s what I ended up with.  Copy this code to mseed/src/test_hello_mseed.c</p><pre class="crayon-plain-tag">/***************************************************************************
 R callable function
 ***************************************************************************/

#include &lt;R.h&gt;
#include &lt;Rdefines.h&gt;

#include &lt;libmseed.h&gt;

SEXP test_hello_mseed () {

  SEXP result;

  static flag  verbose    = 0;
  static int   reclen     = -1;

  MSRecord *msr = 0;

  int dataflag   = 0;
  int64_t totalrecs  = 0;
  int64_t totalsamps = 0;
  int retcode;

  char *inputfile = "mseed/src/libmseed/example/test.mseed";

  /* Loop over the input file */
  while ( (retcode = ms_readmsr (&amp;msr, inputfile, reclen, NULL, NULL, 1,
                 dataflag, verbose)) == MS_NOERROR )
    {
      totalrecs++;
      totalsamps += msr-&gt;samplecnt;

      printf ("Hello World!\n");
    }

  if ( retcode != MS_ENDOFFILE )
    ms_log (2, "Cannot read %s: %s\n", inputfile, ms_errorstr(retcode));

  /* Make sure everything is cleaned up */
  ms_readmsr (&amp;msr, NULL, 0, NULL, NULL, 0, 0, 0);

  PROTECT(result = NEW_INTEGER(2));
  INTEGER(result)[0] = (int) totalrecs;
  INTEGER(result)[1] = (int) totalsamps;
  UNPROTECT(1);

  return(result);
}</pre><p>[04:30]</p>
<p>You may have noticed that we hardcoded the location of our data file.  Not really following best practices here but we&#8217;re just trying to do the minimum to get it all to work.  We just need to make sure we&#8217;re in the directory above mseed/ when we invoke this function.</p>
<h2>5) Add wrapper code, NAMESPACE and DESCRIPTION</h2>
<p>As described in  <a title="Using R — .Call(“hello”)" href="http://mazamascience.com/WorkingWithData/?p=1099">.Call(&#8220;hello&#8221;)</a> we need to add a wrapper function for our C code.  Create mseed/R/mseedWrappers.R with the following content:</p><pre class="crayon-plain-tag"># wrapper function to invoke test_hello_mseed
test_hello_mseed &lt;- function() {
  result &lt;- .Call("test_hello_mseed")
  return(result)
}</pre><p>Now we need to educate R about the namespace associated with our package.  Create mseed/NAMESPACE with the following:</p><pre class="crayon-plain-tag"># Import required packages
# none so far

# Load dynamic libraries (shared object files)
useDynLib(mseed)

# Export functions defined in R code
export("test_hello_mseed")</pre><p>When R compiles everything, the test_hello_mseed() function written in C will be part of the mseed shared object library.</p>
<p>The last step needed for package compilation is the mseed/DESCRIPTION file:</p><pre class="crayon-plain-tag">Package: mseed
Type: Package
Title: Test wrapper for libseed
Version: 0.0-0
Date: 2012-12-12
Author: Jonathan Callahan &lt;jonathan.s.callahan@gmail.com&gt;
Maintainer: Jonathan Callahan &lt;jonathan.s.callahan@gmail.com&gt;
Description: A package that demonstrates how to wrap a C library using
             the seismic data Mini-SEED library as an example
            (http://www.iris.edu/software/libraries/).
License: GPL (&gt;= 2)
Depends: R (&gt;= 2.14)
Collate: mseedWrappers.R</pre><p>[06:50]</p>
<h2>6) Wave your wand and watch the magic!</h2>
<p>Assuming you have everything in place and have copy-pasted well the following should work if you position yourself above the mseed/ directory.  (Always start with R CMD REMOVE mseed to get rid of any previous version.)</p><pre class="crayon-plain-tag">$ R CMD REMOVE mseed
Removing from library ‘/usr/lib/R/library’
Updating HTML index of packages in '.Library'
Making packages.html&amp;nbsp; ... done
$ R CMD build mseed
* checking for file ‘mseed/DESCRIPTION’ ... OK
* preparing ‘mseed’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building ‘mseed_0.0-0.tar.gz’

$ R CMD INSTALL mseed
* installing to library ‘/usr/lib/R/library’
* installing *source* package ‘mseed’ ...
** libs
gcc -m32 -std=gnu99 -I/usr/include/R -Ilibmseed -I/usr/local/include&amp;nbsp;&amp;nbsp;&amp;nbsp; -fpic&amp;nbsp; -O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 
-mtune=atom -fasynchronous-unwind-tables -c test_hello_mseed.c -o test_hello_mseed.o
make[1]: Entering directory `/home/jonathan/mseed/src/libmseed'
gcc -m32 -std=gnu99 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables&amp;nbsp;&amp;nbsp; 
-c -o fileutils.o fileutils.c
...
** building package indices ...
** testing if installed package can be loaded

* DONE (mseed)
Making packages.html&amp;nbsp; ... done
$ R --vanilla
...
&gt; library(mseed)
&gt; test_hello_mseed()
Hello World!
...
Hello World!
[1]   36 4200
&gt;</pre><p>[08:20]</p>
<p><strong>It&#8217;s a miracle!!!</strong></p>
<p>Congratulations are in order.  Go have a coffee or a beer or whatever to kill the rest of your 15 minutes &#8230; unless &#8230;</p>
<p>Unless, that is, you want to tackle actually communicating data between R and this C library.  That will take a few more minutes</p>
<h2>7) Start with a C code example that reads data from a buffer</h2>
<p>Presumably, we&#8217;re packaging our library so that R can pass objects to C code and get objects back.  That means that our C code better know how to read data from a buffer.  We probably want to start out with a C code example that does this.  In the real world you will either have to write the buffer example yourself or ask a friend, preferably the author of the package, to help out.  I started with <a href="http://mazamascience.com/Published/msviewmemory.c">msviewmemory.c</a>.  Better make sure it works:</p><pre class="crayon-plain-tag">$ cd mseed/src/libmseed/example
$ wget http://mazamascience.com/Published/msviewmemory.c
--2012-11-16 16:42:11--&amp;nbsp; http://mazamascience.com/Published/msviewmemory.c
...
$ make msviewmemory
cc -I..&amp;nbsp; -L..&amp;nbsp; msviewmemory.c&amp;nbsp; -lmseed -o msviewmemory
$ ./msviewmemory test.mseed
IU_COLA_00_LHZ, 000001, M, 512, 112 samples, 1 Hz, 2010,058,06:50:00.069539
...</pre><p>[09:30]</p>
<p>So far so good.</p>
<h2>8) Whittle this code down to an R callable routine</h2>
<p>You can try and figure out what the code below does later.  We&#8217;re at risk of going over time so go back to the directory above mseed and just copy and paste the following code into mseed/src/test_hello_mseed_mem.c:</p><pre class="crayon-plain-tag">/*************************************************************************** 
 R callable function to pass raw data to a C test function
 ***************************************************************************/
 #include &lt;R.h&gt;
#include &lt;Rdefines.h&gt;

#include &lt;stdio.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;libmseed.h&gt;

SEXP test_hello_mseed_mem (SEXP buffer) {

  int bufferLength;
  char *bufferPtr;
  SEXP result;

  PROTECT(buffer = AS_RAW(buffer));
  bufferPtr = RAW_POINTER(buffer);
  bufferLength = LENGTH(buffer);

  for (int i=0; i&lt;10; i++) {
    Rprintf("buf[%d]=0x%02x\n",i,bufferPtr[i]);
  }

  Rprintf("bufferLength = %d\n",bufferLength);

  long long int bsize = (long long int) bufferLength;
  long long int boffset = 0;

  static flag  verbose    = 0;
  static int   reclen     = -1;

  MSRecord *msr = 0;

  int dataflag   = 0;
  int64_t totalrecs  = 0;
  int64_t totalsamps = 0;

  /* Loop over the buffer */
  boffset = 0;
  while ( boffset &lt; bsize )
    {
      if ( msr_parse (bufferPtr+boffset, bsize-boffset, &amp;msr, reclen, dataflag, verbose) )
    {
      if ( verbose )
        ms_log (2, "Error parsing record at offset %lld\n", boffset);

      boffset += 256;
    }
      else
    {
      totalrecs++;
      totalsamps += msr-&gt;samplecnt;

          printf ("Hello World!\n");

          boffset += msr-&gt;reclen;
    }
    }

  /* Make sure everything is cleaned up */
  ms_readmsr (&amp;msr, NULL, 0, NULL, NULL, 0, 0, 0);

  PROTECT(result = NEW_INTEGER(3));
  INTEGER(result)[0] = (int) bufferLength;
  INTEGER(result)[1] = (int) totalrecs;
  INTEGER(result)[2] = (int) totalsamps;

  UNPROTECT(2);
  return(result);
}</pre><p>[11:10]</p>
<h2>9) Add a wrapper, external data and a demo</h2>
<p>Add this to mseed/R/mseedWrappers.R:</p><pre class="crayon-plain-tag"># wrapper function to invoke test_hello_mseed_mem
test_hello_mseed_mem &lt;- function(buffer) {
  result &lt;- .Call("test_hello_mseed_mem",buffer)
  return(result)
}</pre><p>Of course, we&#8217;ll also have to add one more line specifying this new function to mseed/NAMESPACE:</p><pre class="crayon-plain-tag">export("test_hello_mseed_mem")</pre><p>Make a copy of test.mseed so that it is found in the normal place for package &#8220;external data&#8221;:</p><pre class="crayon-plain-tag">$ cp mseed/src/libmseed/example/test.mseed ./mseed/inst/extdata/</pre><p>Now add the following demonstration script as mseed/demo/hello.R:</p><pre class="crayon-plain-tag"># Demonstrate successful processing of a miniseed file

library(mseed)

# Read in a mseed file as raw bytes
mseedFile &lt;- system.file("extdata", "test.mseed", package="mseed")
rawData &lt;- readBin(mseedFile, raw(), n=1e+6)

# How many bites do we have?
length(rawData)

# Let's look at the first 10
rawData[1:10]

# Now pass the data to test_hello_mseed_mem()
test_hello_mseed_mem(rawData)</pre><p>and a file named mseed/demo/00Index:</p><pre class="crayon-plain-tag">hello           read mseed file and pass it to a C test function</pre><p>[12:55]</p>
<h2>10) Racing to the finish line</h2>
<p></p><pre class="crayon-plain-tag">$ R CMD REMOVE mseed
...
$ R CMD build mseed
...
$ R CMD INSTALL mseed_0.0-0.tar.gz 
...
]$ R --vanilla
...
&gt; library(mseed)
&gt; demo(package="mseed")
&gt; demo(hello)
...
Hello World!
[1] 18432    36  4200
&gt;</pre><p>[Smack that timer!]<strong> Done.</strong></p>
<p>So, how is your time?  Did you keep up the pace and finish in under 15 minutes?  Or did you get distracted trying to understand what you were doing and go over time.</p>
<p>You should feel good about your accomplishment in either case if you got this far.  (I certainly did!)  Compiling a C library for inclusion with R is not for the faint of heart.  I hope that this baby-steps example makes it seem like it&#8217;s at least possible if perhaps not easy.</p>
<p>Best of luck extending R!</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://mazamascience.com/WorkingWithData/?feed=rss2&#038;p=1151</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<series:name><![CDATA[Using R]]></series:name>
	</item>
		<item>
		<title>Using R &#8212; Callling C code with Rcpp</title>
		<link>http://mazamascience.com/WorkingWithData/?p=1125</link>
		<comments>http://mazamascience.com/WorkingWithData/?p=1125#comments</comments>
		<pubDate>Mon, 12 Nov 2012 18:11:00 +0000</pubDate>
		<dc:creator>Jonathan Callahan</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[Rcpp]]></category>

		<guid isPermaLink="false">http://mazamascience.com/WorkingWithData/?p=1125</guid>
		<description><![CDATA[This entry is part 12 of 12 in the series Using RIn two previous posts we described how R can call C code with .C() and the more complex yet more robust option of calling C code with .Call().  Here &#8230; <a href="http://mazamascience.com/WorkingWithData/?p=1125">&#160;&#160;read more ... </a>]]></description>
				<content:encoded><![CDATA[<div class="seriesmeta">This entry is part 12 of 12 in the series <a href="http://mazamascience.com/WorkingWithData/?series=using-r" class="series-54" title="Using R">Using R</a></div><p>In two previous posts we described how R can <a title="Using R — Calling C Code ‘Hello World!’" href="http://mazamascience.com/WorkingWithData/?p=1067">call C code with .C()</a> and the more complex yet more robust option of <a title="Using R — .Call(“hello”)" href="http://mazamascience.com/WorkingWithData/?p=1099">calling C code with .Call()</a>.  Here we will describe how the <a href="http://cran.r-project.org/web/packages/Rcpp/index.html">Rcpp package</a> can be used to greatly simplify your C code without forcing you to become expert in C++.<span id="more-1125"></span></p>
<p>First off, kudos to Dirk Eddelbuettel and <em></em>Romain François who&#8217;s tireless efforts to improve, promote and document Rcpp have produced one of CRAN&#8217;s most popular packages.  As of Rcpp version 0.9.15 there are 82 &#8220;Reverse depends&#8221; &#8212; other packages that utilize Rcpp.  There are also eight vignettes that describe the package in human terms.  Even more accessible are Dirk&#8217;s <a href="http://dirk.eddelbuettel.com/bio/papers.html">papers</a> and <a href="http://dirk.eddelbuettel.com/bio/presentations.html">presentations</a> and Romain&#8217;s <a href="http://romainfrancois.blog.free.fr/">blog</a>.  While we&#8217;re heaping praise, let&#8217;s not forget to mention Romain&#8217;s <a href="http://gallery.r-enthusiasts.com/">graph gallery</a>.</p>
<p>Even after all that praise and all the available documentation we&#8217;re still left with the problem of where to start.  There are no vignettes targeted at the R user who wants to try out a couple of C routines but isn&#8217;t otherwise inclined to learn C++ and who doesn&#8217;t want to write an entire package &#8212; at least not yet.  A good place to review the motivation behind Rcpp and see some great example code would be Dirk&#8217;s 62 slide  <a href="http://dirk.eddelbuettel.com/papers/rcpp_rfinance_may2012.pdf">Rcpp: Seamless R and C++ Integration</a> presentation or the longer, 146 slide <a href="http://dirk.eddelbuettel.com/papers/rcpp_workshop_introduction_user2012.pdf">Rcpp Tutorial Parts I and II</a>.  Between those two presentations and the package documentation you really have access to all the information you need.  However, in the interest of continuing our &#8220;baby steps&#8221; theme, we will once again recreate our three &#8220;Hello World!&#8221; examples, this time using Rcpp.</p>
<h2>Getting Set Up<strong><br />
</strong></h2>
<p>The very first thing you will have to do is install the <a href="http://cran.r-project.org/web/packages/Rcpp/index.html">Rcpp package</a>.  (See <a title="Using R — Installing Packages" href="http://mazamascience.com/WorkingWithData/?p=728">Installing Packages</a> if this is unfamiliar.)  If you keep your version of R at the bleeding edge, you can do this by invoking R and telling it to just grab the latest package.</p><pre class="crayon-plain-tag">$ R --vanilla

R version 2.14.1 (2011-12-22)
Copyright (C) 2011 The R Foundation for Statistical Computing
...
&gt; install.packages( repos=c('http://cran.fhcrc.org/'), pkgs=c('Rcpp') )
Installing package(s) into ‘/usr/lib/R/library’
(as ‘lib’ is unspecified)
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package ‘Rcpp’ is not available (for R version 2.14.1)</pre><p>Oops!  Some of us are slightly behind the times.  Rather than upgrade my version of R today, I&#8217;ll instead take a look inside the Rcpp <a href="http://cran.r-project.org/src/contrib/Archive/Rcpp/">Old sources: archive</a> and find a version that was released close to the (2011-12-22) date of my version of R.  Version 0.9.10 was released on 17-Feb-2012 and should be compatible.  We&#8217;ll install that version from the command line with the following:</p><pre class="crayon-plain-tag">$ wget http://cran.r-project.org/src/contrib/Archive/Rcpp/Rcpp_0.9.10.tar.gz
...
$ sudo R CMD INSTALL Rcpp_0.9.10.tar.gz
[sudo] password for jonathan: 
* installing to library ‘/usr/lib/R/library’
* installing *source* package ‘Rcpp’ ...
...
** testing if installed package can be loaded

* DONE (Rcpp)
Making packages.html  ... done</pre><p>The last thing we need to do is let our compilers and linkers know where the new, Rcpp libraries are located (see slide 48 of the <a href="http://dirk.eddelbuettel.com/papers/rcpp_workshop_introduction_user2012.pdf">Tutorial</a>).</p><pre class="crayon-plain-tag">$ export PKG_CPPFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`
$ export PKG_CPPFLAGS=`Rscript -e "Rcpp:::LdFlags()"`</pre><p>Yes, Whew! once again.  But now we are ready to write C code that looks more like C code and, once we embrace <a href="%20http://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-sugar.pdf">Rcpp&#8217;s syntactic sugar</a>, perhaps even like R code.</p>
<h2>Baby steps example</h2>
<p>When using the Rcpp package we are still using the .Call() interface to C code.  All of the changes will be seen in the C code which now becomes C++ code with a <span style="font-family: courier new,courier;">.cpp</span> extension.  Because C++ is a superset of C, this is perfectly legal but we will need to educate Rcpp by using <span style="font-family: courier new,courier;">RcppExport</span>.  According to slide 136 of the Tutorial:</p><pre class="crayon-plain-tag">* note : RcppExport is an alias to ‘extern "C"‘ defined by Rcpp.
*
* It gives C calling convention to the [...] function so that
* it can be called from .Call in R. Otherwise, the C++ compiler mangles the
* name of the function and .Call can’t find it.</pre><p>OK.  Here we go with our first  C++ example: <span style="font-family: courier new,courier;">helloA2.cpp</span>.</p><pre class="crayon-plain-tag">#include &lt;Rcpp.h&gt;
RcppExport SEXP helloA2() {
  printf("Hello World!\n");
  return(R_NilValue);
}</pre><p>It looks remarkably similar to <span style="font-family: courier new,courier;">helloA1.c</span> from the previous post and is compiled and invoked in much the same way.</p><pre class="crayon-plain-tag">R CMD SHLIB helloA2.cpp
g++ -m32 -I/usr/include/R -L/usr/lib/R/library/Rcpp/lib -lRcpp -Wl,-rpath,/usr/lib/R/library/Rcpp/lib 
-I/usr/local/include   -I/usr/lib/R/library/Rcpp/include -fpic  -O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 
-mtune=atom -fasynchronous-unwind-tables -c helloA2.cpp -o helloA2.o
g++ -m32 -shared -L/usr/local/lib -o helloA2.so helloA2.o -L/usr/lib/R/library/Rcpp/lib -lRcpp 
-Wl,-rpath,/usr/lib/R/library/Rcpp/lib -L/usr/lib/R/lib -lR</pre><p>The R wrapper code is identical to the one in the previous post for <span style="font-family: courier new,courier;">helloA1.c</span> as is the usage in an R session.</p><pre class="crayon-plain-tag"># call RCpp C code
dyn.load("helloA2.so")
helloA2 &lt;- function() {
  result &lt;- .Call("helloA2")
}</pre><p></p><pre class="crayon-plain-tag">&gt; source('wrappers.R')
&gt; greeting &lt;- helloA2()
Hello World!
&gt; class(greeting)
[1] "NULL"</pre><p></p>
<h2>Simpler C code</h2>
<p>We&#8217;ll leave out the wrappers and R session in the next two examples as they are identical to the examples in the <a title="Using R — .Call(“hello”)" href="http://mazamascience.com/WorkingWithData/?p=1099">previous post</a>.  But just look at how much simpler the C code gets!  Instead of allocating memory, protecting from garbage collection and all that casting between types we get this for <span style="font-family: courier new,courier;">helloB2.cpp</span>:</p><pre class="crayon-plain-tag">#include &lt;Rcpp.h&gt;
RcppExport SEXP helloB2() {
  Rcpp::StringVector result(1);
  result[0] = "Hello World!";
  return(result);
}</pre><p>and this for <span style="font-family: courier new,courier;">helloC2.cpp</span>:</p><pre class="crayon-plain-tag">#include &lt;Rcpp.h&gt;
RcppExport SEXP helloC2(SEXP greetingPointer) {
  Rcpp::StringVector greeting(greetingPointer);
  Rcpp::NumericVector result(greeting.size());
  for (int i=0; i&lt;greeting.size(); i++) {
    result[i] = strlen(greeting[i]);
  }
  return(result);
}</pre><p>Now we&#8217;re talkin&#8217;.  This is starting to look like code a C programmer, Heck, even a Java programmer could get comfortable with.  The tools provided by Rcpp are systematic, readable and, as a welcome change, very well documented.  Whereas I was hesitant to recommend writing C code for the .Call() interface because of the painful learning curve, I am happy to report that the learning curve for using Rcpp does <strong>not</strong> require that you first climb a mountain to learn C++.  C programmers of all skill levels will benefit from using Rcpp.</p>
<h2>More Examples</h2>
<p>While a &#8220;Hello World!&#8221; example may be a great place to start, it is unlikely to provide any useful template code for people.  For that you will want to poke around in the source code of the many packages that depend on Rcpp.  Dirk Eddelbuettel is of course quite interested in these dependent packages and describes some of them on slide 42 of <a href="http://dirk.eddelbuettel.com/papers/rcpp_rfinance_may2012.pdf">Seamless R and C++ Integration</a> and slide 29 of <a href="http://dirk.eddelbuettel.com/papers/rcpp_workshop_introduction_user2012.pdf">Rcpp Tutorial Parts I and II</a>.  I was pleased to learn that some of these packages are written in C and will hopefully provide excellent example code.</p>
<p>Hadley Wickham has written a <a href="https://github.com/hadley/devtools/wiki/Rcpp" target="_blank">comprehensive tutorial for the Rcpp package</a>.</p>
<p>To everyone who is trying to improve and extend R &#8212; Best of Luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://mazamascience.com/WorkingWithData/?feed=rss2&#038;p=1125</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<series:name><![CDATA[Using R]]></series:name>
	</item>
	</channel>
</rss>
