When someone asks me what the main advantage of R over Python is, I almost always
answer: “It’s dplyr! The way you can handle data.frames is a dream.”
Pythonista: “But pandas also has DataFrames. They are built to resemble their
counterparts in R.”
Me: “That’s true. But they only match the functionality of plain R. R has since moved
several steps ahead with dplyr.”
But dplyr is built mainly for interactive data exploration. So it’s very easy to
select, mutate, group and summarize your data.frame (or tibble).
The reason is non-standard evaluation (NSE) (see more
in Hadley Wickham’s book Advanced R). NSE occurs when
you use a column name without any quoting.
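
To make the idea of unquoted column names concrete, here is a minimal sketch. The tibble `df` and its columns `group` and `value` are made up for illustration and do not come from any real data set; it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data, only for illustration
df <- tibble(group = c("a", "a", "b"), value = c(1, 2, 3))

# NSE: `group` and `value` are written as bare names, without quotes.
# dplyr looks them up as columns of df instead of as ordinary variables.
totals <- df %>%
  group_by(group) %>%
  summarise(total = sum(value))

# For comparison, plain R without NSE: the columns have to be
# addressed explicitly through df$...
totals_base <- tapply(df$value, df$group, sum)
```

Both calls compute the sum of `value` per `group`; the dplyr version just lets you type the column names as if they were variables.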
But when it comes
to programming, it gets a little more complicated.
So let’s look at an example: