dplyr 0.7.0 has been released. One of its greatest improvements is better support for standard evaluation.

Let’s look at an example. Say I want to apply a function to a column of a data.frame, but the name of that column can change from call to call. The column to use is therefore given by the value of a string.
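To see why this needs special handling, here is a minimal sketch of the naive attempt. The data.frame `demo` and the use of `toupper()` are purely illustrative assumptions; the point is only that a plain `mutate()` call treats `key` as a literal column name rather than looking up its value.

```r
library(dplyr, warn.conflicts = FALSE)

# Illustrative data; `demo` is a hypothetical name used only in this sketch
demo <- data.frame(a = c(1, 2, 3), b = c("x", "y", "z"), stringsAsFactors = FALSE)
key <- "b"

# Naive attempt: the left-hand side is taken literally, so a new column
# called "key" is created. On the right-hand side, `key` is not a column
# of `demo`, so the string "b" itself is transformed and recycled.
naive <- demo %>%
  mutate(key = toupper(key))

names(naive)  # the original column b is left untouched
```

Column b is not modified at all, which is why some way of telling dplyr "use the value of this string" is needed.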

Prior to dplyr 0.7.0 I used the following approach based on lazyeval:

library(dplyr, warn.conflicts = FALSE)
library(lazyeval)
 
data <- data.frame(a=c(1,2,3), b=c("aaaa", "bb&bb", "ccccc"), stringsAsFactors=FALSE)
data
##   a     b
## 1 1  aaaa
## 2 2 bb&bb
## 3 3 ccccc
# key holds the name of the column to be changed
key <- "b"
 
# mutate_call holds the function to replace each & by \&
mutate_call <- lazyeval::interp(~gsub("&", '\\\\&', var), var=as.name(key))
 
data %>%
  mutate_(.dots = setNames(list(mutate_call), key))
## Warning: `mutate_()` is deprecated as of dplyr 0.7.0.
## Please use `mutate()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
##   a       b
## 1 1    aaaa
## 2 2 bb\\&bb
## 3 3   ccccc

Calling lazyeval::interp feels a little clumsy.

Starting with dplyr 0.7.0 you can do it this way:

library(dplyr)
 
# Same data as before
data <- data.frame(a=c(1,2,3), b=c("aaaa", "bb&bb", "ccccc"), stringsAsFactors=FALSE)
key <- "b"
 
# Define a simple function doing the replacement
my_func <- function(var_in) {
  var_out <- gsub('&', '\\\\&', var_in)
  return(var_out)
}
 
# Call mutate with some syntactic sugar:
data %>%
  mutate(!!key := my_func(.data[[key]]))
##   a       b
## 1 1    aaaa
## 2 2 bb\\&bb
## 3 3   ccccc

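As you can see, the new way is a lot easier to read and understand. I use three pieces of syntax here:

  • First, !!key “unquotes” key, so that its value is used as the name of the column being assigned (here the column already exists, so it is overwritten).
  • Second, := is used in place of = to assign the value of the right-hand side to that column, because R does not allow an expression such as !!key on the left-hand side of =.
  • Third, .data[[key]] accesses the column of the data.frame whose name is stored in key.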
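Since the column name is now just a string, the same three pieces can be wrapped in a reusable function. The following is only a sketch under that assumption: the helper name `escape_ampersand` and its arguments `df` and `col` are hypothetical and not part of dplyr.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: escape every & in the column whose name is passed
# as a string in `col`, leaving all other columns unchanged.
escape_ampersand <- function(df, col) {
  df %>%
    mutate(!!col := gsub("&", "\\\\&", .data[[col]]))
}

# Same data as before
data <- data.frame(a = c(1, 2, 3), b = c("aaaa", "bb&bb", "ccccc"), stringsAsFactors = FALSE)

# The column to change is chosen at call time
result <- escape_ampersand(data, "b")
result
```

The caller only has to pass a different string to apply the replacement to another character column.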

All this and much more is explained here.