>The pipe operator in R also sort of does my head in.

As someone who taught myself base R from scratch in 2013, I (used to!) agree with this. When the pipe operator was first introduced, I’d roll my eyes whenever I saw a script that used it and move along.

But I forced my brain to adapt, and now it’s probably my favorite feature of the R language. Data science is full of sequences of transformations, and in my opinion it’s more readable and bug-resistant to phrase these long chains as:

  f(x) %>% g() %>% h()

rather than:

  h(g(f(x)))

or certainly:

  foo <- f(x) 
  foo <- g(foo)
  foo <- h(foo)

I can comprehend and modify other people’s R code (including my own past code) much more quickly with this paradigm. You can quickly debug a chain by commenting out functions one at a time, starting from the end (e.g. first test: “f(x) # %>% ...”). It also becomes much faster to plug new transformations into the chain when needed.
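To make that concrete, here's a minimal sketch using magrittr and the built-in mtcars data. The particular steps (subset, getElement, mean) are just my stand-ins for f, g, and h, not anything from a real pipeline.

```r
library(magrittr)  # provides %>%

# Stand-ins for f, g, h: filter rows, pull out a column, summarise it.
piped <- mtcars %>%
  subset(cyl == 4) %>%
  getElement("mpg") %>%
  mean()

# The same computation written inside-out.
nested <- mean(getElement(subset(mtcars, cyl == 4), "mpg"))

stopifnot(isTRUE(all.equal(piped, nested)))

# Debugging: comment out the tail of the chain to inspect an
# intermediate result (here, the filtered data frame).
partial <- mtcars %>%
  subset(cyl == 4) # %>%
  # getElement("mpg") %>%
  # mean()
```

Once `partial` looks right, uncomment the next step and repeat.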

One thing that helps to keep track of the input x as it moves through the chain is using the “.” placeholder (especially when you need to specify function arguments), like so:

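  f(x) %>%
    g(., n = 100, param = "baz") %>%
    { mean(.$column_name) }

Here, the . stands in for “whatever is coming out of the pipe” from the left. The braces on the last step matter: without them, %>% would still pass the whole object as the first argument to mean(), because a . that only appears inside an expression like .$column_name doesn’t count as the placeholder argument.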
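If it helps, here is a small runnable sketch of how the placeholder behaves, again using mtcars as a stand-in for real data. It assumes magrittr's %>%; the native |> pipe has different placeholder rules, so don't expect these to carry over unchanged.

```r
library(magrittr)

# "." as a direct argument: the piped value goes exactly where the dot is.
n_rows <- mtcars %>% nrow(.)

# "." as a direct argument in a later position: nothing is added
# in front, so this is getElement(mtcars, "mpg").
mpg <- "mpg" %>% getElement(mtcars, .)

# "." only inside a nested expression: wrap the step in braces so
# magrittr does not also pass the data frame as the first argument.
avg_mpg <- mtcars %>% { mean(.$mpg) }

stopifnot(
  n_rows == nrow(mtcars),
  identical(mpg, mtcars$mpg),
  isTRUE(all.equal(avg_mpg, mean(mtcars$mpg)))
)
```

The third case is the one that tends to bite: without the braces the call becomes mean(mtcars, mtcars$mpg), which is not what you want.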