Archive for the 'R' Category

Milestones for epanetReader

Nearly two years ago, I released a package for reading water network data into R. The package, called epanetReader, has had a few milestones recently, so I thought it would be good to celebrate with a blog post.

Featured in IBM’s Environment Report

The epanetReader package was featured in IBM’s corporate environmental report for 2016. You can read all about it on page 43.

Now in 100 Countries

In June 2017, Uruguay became the 100th country where epanetReader was downloaded from RStudio’s CRAN server.  Welcome to the colleagues in Uruguay! The total number of downloads from this server is now over 6000.  There are about 120 other CRAN servers that do not provide logs so the total number of downloads is anyone’s guess.

New Release

Version 0.5.1 was released to CRAN on 26 May 2017. This version adds a new plotting option to include labels of elements in the network, similar to the EPANET Windows graphical interface. Labels are enabled like this:

plot(Net1, plot.labels = TRUE)

The new release also includes the test suite as part of the CRAN package. Previously, the test suite was only available on GitHub. Now the tests are fully part of the package and run as part of the package checking process. Code coverage is currently at 94%. Contributions of new tests to improve the coverage are very welcome.

epanetReader now supports EPANET-MSX and Sparklines

epanetReader v0.3.1 is now available on CRAN.  This new version has several updates and enhancements:

  • Added read.msxrpt() function to read results of multi-species water quality simulations made with EPANET-MSX
  • Added plotSparklineTable() function to generate tables of sparklines for exploratory analysis
  • Added data objects Net1 and Net1rpt in rdata format so some examples run without having to parse the text files.
  • Changed treatment of some columns of sections of .inp files from character to factor to improve summaries (e.g. Pipe Status, Valve Type)
  • Added support to read Status and Demands sections of .inp files
  • Plotting changes for simulation results: fixed bug for plotting with valves, changed aspect ratio to 1 for maps
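The change from character to factor columns matters because summary() treats the two types differently. The following is a minimal base-R sketch of that difference; the status vector is made-up illustration data, not output from epanetReader.

```r
# Hypothetical pipe status values, for illustration only
status <- c("Open", "Open", "Closed", "Open")

# As character: summary() reports only length, class, and mode
summary(status)

# As factor: summary() counts the occurrences of each level
status_counts <- summary(factor(status))
print(status_counts)
```

With factors, a one-line summary already tells you how many pipes are open or closed, which is the kind of overview the package summaries aim for.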

Update to the new version, or install for the first time, by running the following command in an R session.
> install.packages("epanetReader")

Analyze Water Networks using Epanet and R with epanetReader

Epanet is popular software for water network simulation, freely available from the US Environmental Protection Agency. R is a popular and freely available environment for computing and graphics.

epanetReader is a new add-on package for R that reads Epanet files for further analysis and visualization. As of today, epanetReader is available through the R archive (CRAN) in addition to GitHub. For those already familiar with Epanet and R, get started by installing the package.

> install.packages("epanetReader")  
> library(epanetReader) 

Read in your favorite network file this way:

> n1 <- read.inp("Net1.inp")

Get a quick summary of the network using the summary() function:

> summary(n1)

Look at the package page on GitHub (github.com/bradleyjeck/epanetReader) or the following help pages for further examples of package usage.
> help(read.inp)
> help(read.rpt)

If you are interested in water networks but new to R or Epanet, here are a few resources to get started.

Feedback on the package is most welcome! Details are on the Contact page.

Writing Simple R Packages

I’ve been using R for a while now — since 2008!?!  But I’ve only recently started writing simple packages. From another user’s perspective, a package is the easiest way to use code someone else wrote.  Happily, getting from scripts that work to a simple package is a short journey.

Compared to a set of scripts, packages are easier for others to use because of documentation and namespaces. If you are looking at this page, you have probably spent some time reading documentation for R. I’m sure you’ll agree that documented functions are much easier to use than undocumented ones. Namespaces are a somewhat more obscure concept in R programming. Namespaces specify how R looks for things and in this way control the visibility of functions. In packages, namespaces make some functions easily accessible while making other functions less visible. It’s a bit like the difference between public and protected members of a Java class. As shown in the example below, putting a few special comment lines in your existing code will sort out both documentation and namespace.
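To make the visibility point concrete, here is a small sketch that inspects the namespace of the stats package, which ships with R. It only uses base functions; the variable names are my own.

```r
# Names defined inside the stats namespace (exported or not)
all_names <- ls(asNamespace("stats"), all.names = TRUE)

# Names the package makes visible to users
exported <- getNamespaceExports("stats")

# sd() is exported, so it is available as sd() or stats::sd()
"sd" %in% exported

# Internal helpers live in the namespace but are not exported;
# they can only be reached with the ::: operator
internal <- setdiff(all_names, exported)
length(internal) > 0
```

The same split applies to your own package: exported functions are what users see, and everything else stays in the background.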

There is a lot of material online already about what R packages are and different perspectives on how to write them. Three references in particular come up frequently:

In this post, I describe a workflow for developing simple R packages that is heavily influenced by Wickham’s work. There is also an illustrative example. By “simple” package, I mean that there is no compiled code as part of the package: the code you want to share is pure R.

Workflow

1. Create a directory structure

Packages are a collection of files organized into directories. A simple package needs only a few directories.

package-name
    + man
    + R
    + tests
    DESCRIPTION

The top-level folder is the name of your package. The man folder will contain files for making the package manual. The R folder holds the source code for your package. The tests folder contains tests of the code in your package. The DESCRIPTION file gives a short description of the package.
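If you prefer to create this skeleton from R rather than by hand, a sketch along these lines should work. It builds the folders in a temporary directory so nothing on disk is overwritten, and "mypackage" is just a placeholder name.

```r
# Create the skeleton in a temporary directory;
# "mypackage" is a placeholder package name
pkg <- file.path(tempdir(), "mypackage")
for (d in c("man", "R", "tests")) {
  dir.create(file.path(pkg, d), recursive = TRUE, showWarnings = FALSE)
}

# An empty DESCRIPTION file to be filled in later
file.create(file.path(pkg, "DESCRIPTION"))

# List what was created
list.files(pkg)
```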

2. Develop and test your code

The starting point for most packages is a set of scripts that work. Put these in the R folder. Unit tests are indispensable for making sure that the code actually works the way you think it does. These go in the tests folder. A unit testing framework such as the testthat package helps to organize and run these.

The way this works for me is to open an R session and make package-name/tests the working directory.  As I write a function in the R folder, I run tests from the tests folder.

3. Create Documentation & Namespace

Documentation for R packages should be in the .Rd format. The roxygen2 package creates documentation in the .Rd format automatically from the comments in your source files. It also creates a NAMESPACE file that tells R about the visibility of the package functions.

> setwd("package-name")
> library(roxygen2)
> roxygenize(".")

4. Build and check the package

Once the package has code and documentation, it can be built and checked from the shell using the Rtools.

$ R CMD build path/to/package-name
$ R CMD check package-name_x.y.z.tar.gz

5. Share

After your package passes R CMD check, you can have good confidence that the package will behave in a reasonable way for your users.

 

Example

A simple example helps illustrate this process.  Let’s say that I want to share functions to compute the geometric and harmonic mean of a vector of numbers.  Along with the arithmetic mean, these are the three classical Pythagorean means.  But only the arithmetic mean is built into base R.  So I’m going to make a package called pythagmeans with these functions.
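For reference, these are the standard definitions of the three means for a vector of n positive numbers x_1, ..., x_n. The labels AM, GM, and HM are shorthand I am introducing here.

```latex
\begin{align*}
\mathrm{AM}(x) &= \frac{1}{n}\sum_{i=1}^{n} x_i \\
\mathrm{GM}(x) &= \Bigl(\prod_{i=1}^{n} x_i\Bigr)^{1/n} \\
\mathrm{HM}(x) &= \frac{n}{\sum_{i=1}^{n} 1/x_i}
\end{align*}
```

For positive values the three are ordered as AM ≥ GM ≥ HM, with equality only when all the values are equal.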

1. Create directory structure

In addition to the directory structure, you also need a description file.

pythagmeans
    + man
    + R
    + tests
    DESCRIPTION

A minimal description file looks like this.


Package: pythagmeans
Type: Package
Title: Pythagorean Means
Version: 0.0.1
Date: 2015-06-16
Author: Bradley J. Eck
Maintainer: Bradley Eck <brad@bradeck.net>
Description: Compute the three classical Pythagorean means: arithmetic, geometric, and harmonic.
License: MIT

2. Develop and test

This is a small package so I have only two source files: the package source and a file of tests.

The package source file has five functions. There is one function for each type of mean (arithmetic, geometric, harmonic) and there are two helper functions. The helper functions check the argument and give errors. The comment prefix #' denotes a roxygen comment that is parsed for use in building the documentation. The @export tag marks a function as available to the package user. Note that the helper functions are not exported.


# File:  pythagmeans/R/PythagoreanMeans.r 

#' Arithmetic Mean
#'
#' Computes the arithmetic mean of a vector of numbers
#'
#' @export 
#' @param x vector of numbers without NAs
arithmetic_mean <- function( x ) { 
  argCheck(x)
  am <- mean(x) # used the built-in function
  return( am )
}

#' Geometric Mean
#'
#' Computes the geometric mean of a vector of numbers
#'
#' @export 
#' @param x vector of numbers without NAs
geometric_mean <- function ( x ) {
  argCheck(x)
  n <- length(x) 
  gm <- prod(x)^(1/n)
  return( gm ) 
}

#' Harmonic Mean
#'
#' Computes the harmonic mean of a vector of numbers
#'
#' @export 
#' @param x vector of numbers without NAs
harmonic_mean <- function( x ) {
  argCheck(x)
  n <- length(x) 
  hm <- n / sum( reciprocal( x ) ) 
  return( hm ) 
}

# confirm argument is numeric and without NAs
argCheck <- function( x ){
   if( !is.numeric(x) ){ stop("argument must be numeric") }
   if( anyNA(x) ){ stop("NA values not allowed") }
}

# a helper function to compute reciprocals; stops if a zero is present
reciprocal <- function( x ) {
  if( any( x == 0 ) ){ stop("zeros not allowed in x") }
  recip <- 1/x
  return( recip )
}
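One caveat about the product formula for the geometric mean: for long vectors or large values, the product can overflow before the root is taken. A common alternative, not used in the package source in this post, is to work on the log scale. The sketch below assumes strictly positive inputs, and geometric_mean_log is a hypothetical name.

```r
# Alternative sketch: geometric mean computed on the log scale.
# geometric_mean_log is a hypothetical name, not part of the package.
geometric_mean_log <- function(x) {
  stopifnot(is.numeric(x), !anyNA(x), all(x > 0))
  exp(mean(log(x)))
}

x <- rep(1e10, 100)

# The product overflows to Inf, so the product-based formula returns Inf
prod(x)^(1 / length(x))

# The log-scale version stays finite
geometric_mean_log(x)
```

The trade-off is that the log-scale version rejects zeros and negative numbers, which the product formula accepts.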

The file of test code looks like this. To run the tests, call test_dir(".") from an R session whose working directory is the tests directory.


# File: pythagmeans/tests/test_PythagoreanMeans.r

library(testthat)

# assumes /tests is the working directory  
source("../R/PythagoreanMeans.r")

context("verify means")
test_that("arithmetic mean is correct",{
  x <- c(1,2,3)
  expect_equal( arithmetic_mean(x), expected = 2 ) 
})

test_that("geometric mean is correct",{  
  x <- c(1,2,3)
  expect_equal( geometric_mean(x), expected = 1.817121, 
                tolerance = 0.000001,  scale = 1 )
})

test_that("harmonic mean is correct",{ 
  x <- c(1,2,3)
  expect_equal( harmonic_mean(x), expected = 1.636364, 
                tolerance = 0.000001, scale = 1 ) 
})

context("throwing errors") 
test_that( "error on NA entry",{
  x<- c(1,NA,3)
  expect_error( arithmetic_mean(x), "NA values not allowed" )
})

test_that( "error on 0 entry for harmonic_mean",{
  x<- c(1,0,3)
  expect_error( harmonic_mean(x), "zeros not allowed" )
})

3. Generate documentation

With the roxygen2 package, generating documentation becomes automatic. The NAMESPACE file is also generated automatically.


> setwd("path/to/pythagmeans")
> library(roxygen2)
> roxygenize(".")
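For the three @export tags in the source file above, I would expect the generated NAMESPACE file to look roughly like the following. The exact header line depends on the roxygen2 version, so treat this as a sketch rather than verbatim output.

```
# Generated by roxygen2: do not edit by hand

export(arithmetic_mean)
export(geometric_mean)
export(harmonic_mean)
```

The helper functions argCheck and reciprocal do not appear because they have no @export tag. The man folder should also gain one .Rd file per documented function.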

4. Build & check

Once the tests pass and the documentation is generated you're ready to build and check the package.


$ R CMD build path/to/pythagmeans
$ R CMD check pythagmeans_0.0.1.tar.gz

5. Share, Install, Use

These functions are now easily shared with colleagues as the archive file built by R CMD build.

$ R CMD INSTALL pythagmeans_0.0.1.tar.gz

After installation the exported functions are available by loading the package in an R session.

> library(pythagmeans)