Use `purrr` to feed four cats

Replacing a for loop with purrr::map_*

In this example we will show you how to go from a ‘for loop’ to purrr. Use this as a cheatsheet when you want to replace your for loops.

Imagine having 4 cats: four real cats who need food, care and love to live a happy life. They are starting to meow, so it’s time to feed them. Our real life algorithm would be:

  • give food to cat 1
  • give food to cat 2

But life is not that simple. Some of your cats are good cats and some are bad. You want to spoil the good cats and give normal food to the bad cats (clearly the authors do not have cats in real life, because cats will complain no matter what).

Elements of the four cats problem

Let’s work out this ‘problem’ in R. In real life we would have to have some place for the cats and their bowls. In R the place for cats and their bowls can be a data.frame (or tibble).

Let’s start with the cats. The cats have a name and a temperament: they are good (❤️) or bad (⚡). Let’s call that goodness (see footnote 1).

library(tibble)
library(dplyr)
library(purrr)
cats <- tribble(
    ~Name, ~Good,
    "Suzie", TRUE,
    "Margaret", FALSE,
    "Donald", FALSE,
    "Tibbles", TRUE
    )

as_cat <- 
    function(name){
        paste(emo::ji("cat"), name)
    } 

as_temperament <- 
    function(good) {
        ifelse(good, emo::ji("heart"), emo::ji("lightning"))
    }

cats$Name <- as_cat(cats$Name)
cats$Good <- as_temperament(cats$Good)

cats
## # A tibble: 4 x 2
##   Name        Good 
##   <chr>       <chr>
## 1 🐱 Suzie    ❤️   
## 2 🐱 Margaret 🌩   
## 3 🐱 Donald   🌩   
## 4 🐱 Tibbles  ❤️

Now we have to feed the cats. Only good cats get premium quality super-awesome-food. Bad cats get normal food.

If you have used dplyr or ifelse statements before, this example is super silly and inefficient. But for people coming from different programming languages and people who started out with for loops, this is the approach you might use.
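For readers who already know the vectorised route, here is a minimal sketch of what that could look like. The `cats_plain` data frame and the plain-text food labels are our own stand-ins (assumptions, not the emoji table used in this post), so the snippet runs with base R only.

```r
# Hypothetical plain-text version of the cats table (no emoji, base R only)
cats_plain <- data.frame(
    Name = c("Suzie", "Margaret", "Donald", "Tibbles"),
    Good = c(TRUE, FALSE, FALSE, TRUE),
    stringsAsFactors = FALSE
)

# ifelse() is vectorised: it checks every element of Good in one call
cats_plain$Food <- ifelse(cats_plain$Good, "Premium food", "Normal food")

cats_plain
```

No loop is needed here because `ifelse()` handles the whole column at once. The rest of this post deliberately takes the longer road.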

To feed the cat, we have to determine if the cat is Good or not. We don’t want to type too much; we are lazy data scientists after all. So we create a function to do our work.

# Decide which food a single cat gets, based on its temperament
good_kitty <- 
    function(x) {
        if (x == emo::ji("heart")) {
            paste0(emo::ji("cake"), " Premium food")
        } else {
            paste0(emo::ji("bread"), " Normal food")
        }
    }

good_kitty(cats$Good[1])
## [1] "\U0001f370 Premium food"
# alternative function name
# feed_with <- function(temperament) {
#     if (temperament == as_temperament(TRUE)) {
#         paste0(emo::ji("cake")," Premium food")
#     } else {
#         paste0(emo::ji("bread"), " Normal food")
#     }
# }

We also want to cuddle our cats. Unfortunately, some of the cats don’t want to be petted: Suzie and Donald don’t want to be cuddled. So let’s create a function that takes care of that.

pet_cat <- 
    function(name){
    if(name %in%  as_cat(c("Margaret", "Tibbles"))){
        "pet"
    }else{
        "don't pet, stay away!"
    }
}
pet_cat(as_cat("Donald"))
## [1] "don't pet, stay away!"
pet_cat(as_cat("Tibbles"))
## [1] "pet"
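Functions written with a plain `if` like the ones above expect one value at a time. The sketch below illustrates this with `pet_plain()`, a hypothetical emoji-free copy of `pet_cat()` (our assumption, so the snippet runs without the emo package). As far as we know, recent R versions (4.2.0 and later) raise an error when `if` receives a condition longer than one, while older versions only warned, so the sketch catches both.

```r
# Hypothetical single-value function in the same style as pet_cat()
pet_plain <- function(name) {
    if (name %in% c("Margaret", "Tibbles")) {
        "pet"
    } else {
        "don't pet, stay away!"
    }
}

all_names <- c("Suzie", "Margaret", "Donald", "Tibbles")

# Passing the whole vector gives if() a condition of length 4.
# tryCatch() records whether R signals an error or a warning here.
whole_vector <- tryCatch(
    pet_plain(all_names),
    error = function(e) "error",
    warning = function(w) "warning"
)
whole_vector

# One name at a time works fine
first_cat <- pet_plain(all_names[1])
second_cat <- pet_plain(all_names[2])
first_cat
second_cat
```

This is why we need some way to repeat the call for every cat.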

Assigning food and love to kitties

There are two things we want to do for every row:

  • give food based on Goodness
  • pet based on name

This is something we would want to use a ‘for loop’ for. If you have programmed in C(++) or Python, this is a natural approach. And it is not wrong to do this in R either.

# First we create a place for the Food to go into
# and for the petting to take place.
# NA_character_ makes both columns character columns from the start.
cats$Food <- NA_character_
cats$Petting <- NA_character_

Then we start a ‘for loop’:

for (cat in 1:nrow(cats)){
    cats$Food[cat] <- good_kitty(cats$Good[cat])
    cats$Petting[cat] <- pet_cat(cats$Name[cat])
}
cats
## # A tibble: 4 x 4
##   Name        Good  Food            Petting              
##   <chr>       <chr> <chr>           <chr>                
## 1 🐱 Suzie    ❤️    🍰 Premium food don't pet, stay away!
## 2 🐱 Margaret 🌩    🍞 Normal food  pet                  
## 3 🐱 Donald   🌩    🍞 Normal food  don't pet, stay away!
## 4 🐱 Tibbles  ❤️    🍰 Premium food pet

A ‘for loop’ is you telling your computer: ‘I will only tell you this once, but you must do it to everything in the collection’. To make that work the computer needs to know two things: the iterator part, (something in collection), in this case (cat in 1:nrow(cats)), and what needs to happen inside the loop, the {actions}. The iterator part names a variable that takes a different value every iteration: something, or in this case cat. People often use i for this variable, as in for (i in list), but as you can see, the name really doesn’t matter.
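To make the parts of a loop concrete, here is a small sketch with made-up data (the `bowls` vector is hypothetical and not part of the cats example). It runs the same loop twice with a different variable name to show that the name has no effect on the result.

```r
# A small collection to loop over (hypothetical example data)
bowls <- c("bowl A", "bowl B", "bowl C")

# for (<variable> in <collection>) { <actions> }
filled <- character(length(bowls))
for (i in seq_along(bowls)) {
    filled[i] <- paste("filled", bowls[i])
}

# Same loop, different variable name: the result does not change
filled_again <- character(length(bowls))
for (bowl_number in seq_along(bowls)) {
    filled_again[bowl_number] <- paste("filled", bowls[bowl_number])
}

identical(filled, filled_again)
```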

Let’s get into a bit more detail

What’s going on in the for loop?

Inside the loop we see cats$Food[cat]: we’re taking the Food column in the cats data.frame and then indexing it with []. cats$Food[1] would take the first item in this collection, cats$Food[2] the second, etc.

Where does the index come from?

For loops are dumb. They don’t know and don’t care about the collection they are moving through. The only thing they know is that they move through every element of the collection until the end. In our case the collection is 1:nrow(cats), a sequence from 1 to the number of rows in the data.frame.
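Because the loop blindly follows whatever collection it gets, an empty table is worth a quick check. The sketch below uses a hypothetical empty data frame, `no_cats`, to compare `1:nrow()` with base R’s `seq_len()`. It is a side note under that assumption, not something the four cats example needs.

```r
# Hypothetical empty table: no cats adopted yet
no_cats <- data.frame(Name = character(0), stringsAsFactors = FALSE)

# 1:nrow() counts down from 1 to 0, so a loop over it would run twice
colon_index <- 1:nrow(no_cats)
colon_index

# seq_len() returns an empty sequence, so a loop over it would not run at all
safe_index <- seq_len(nrow(no_cats))
safe_index
```

With four cats both forms give the same sequence, so the loop in this post behaves the same either way.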

When your ‘for loop’ doesn’t behave as you expect, the best way to debug it is to print the index at the start of the loop.
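A minimal sketch of that debugging habit, using hypothetical plain-text data (`temperament` and `food` are stand-ins we made up so the snippet runs without the emo package):

```r
# Hypothetical plain-text data, so the sketch runs without the emo package
temperament <- c("good", "bad", "bad", "good")
food <- character(length(temperament))

for (cat in seq_along(temperament)) {
    # Print the index first: if the loop fails, the last printed
    # value tells you which element caused the problem
    print(cat)
    food[cat] <- if (temperament[cat] == "good") "Premium food" else "Normal food"
}

food
```

Once the loop works you can remove the `print()` call again.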

Functional approach

For loops are used very often and are very handy. But they do require a bit of mental overhead: you need to think about the collection you are iterating over and use the correct indices to put stuff in. It’s easy to get lost.

With the purrr package we approach the problem from another direction: we think about small steps. We want to execute the pet and feeding functions on every row, we do not care about the specifics of the row number.
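The idea of handing a function to every element, without touching an index, can be sketched as follows. `feed_plain()` and the `temperament` vector are hypothetical plain-text stand-ins for the emoji versions in this post. The only purrr functions used are `map()` and `map_chr()`.

```r
library(purrr)

# Hypothetical single-value function, in the style of good_kitty()
feed_plain <- function(temperament) {
    if (temperament == "good") "Premium food" else "Normal food"
}

temperament <- c("good", "bad", "bad", "good")

# map() always returns a list, one element per input
as_list <- map(temperament, feed_plain)

# map_chr() returns a character vector instead
as_character <- map_chr(temperament, feed_plain)

str(as_list)
as_character
```

Note that there is no counter and no pre-allocated result: purrr takes care of walking through the elements and collecting the output.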

In the purrr style of thinking (functional programming) we want to ‘map’ the function to the data. We do care about the outcome of every execution: we want it to be a character value.

Before showing the purrr solution we remove Food and Petting from cats (this is a necessary evil, or if this is really upsetting to you, we wait for half a day before starting again).

# reset the Food and Petting
cats$Food <- NULL
cats$Petting <- NULL

Let’s use map_chr

# execute on every element of cats$Good the good kitty function and 
# then cats$Name on pet_cat function
cats$Food <- map_chr(cats$Good, good_kitty)
cats$Petting <- map_chr(cats$Name, pet_cat)
cats
## # A tibble: 4 x 4
##   Name        Good  Food            Petting              
##   <chr>       <chr> <chr>           <chr>                
## 1 🐱 Suzie    ❤️    🍰 Premium food don't pet, stay away!
## 2 🐱 Margaret 🌩    🍞 Normal food  pet                  
## 3 🐱 Donald   🌩    🍞 Normal food  don't pet, stay away!
## 4 🐱 Tibbles  ❤️    🍰 Premium food pet

The results are the same.
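If you prefer to stay inside a dplyr pipeline, the same two calls also fit inside `mutate()`. This is a sketch under our own assumptions: `cats_plain`, `feed_plain()` and `pet_plain()` are hypothetical plain-text stand-ins for the emoji table and helpers above, so the snippet runs without the emo package.

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical plain-text stand-ins for the cats table and its two helpers
cats_plain <- tribble(
    ~Name,      ~Good,
    "Suzie",    "good",
    "Margaret", "bad",
    "Donald",   "bad",
    "Tibbles",  "good"
)
feed_plain <- function(temperament) {
    if (temperament == "good") "Premium food" else "Normal food"
}
pet_plain <- function(name) {
    if (name %in% c("Margaret", "Tibbles")) "pet" else "don't pet, stay away!"
}

# mutate() adds both columns in one step; map_chr() still does the iteration
cats_fed <- cats_plain %>%
    mutate(
        Food    = map_chr(Good, feed_plain),
        Petting = map_chr(Name, pet_plain)
    )

cats_fed
```

Whether you assign with `$` or use `mutate()` is mostly a matter of taste; the mapping step is identical.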

Resources

Just when we were finishing up this post there was an excellent tweet thread on purrr resources!

Complete list at moment of writing

This post was created by Roel & Reigo, with lots of help from Sathira.

I met Sathira and Reigo at the satRday in Amsterdam in September 2018. We worked on this project during the open source day.

Head image is from https://pixabay.com/en/cat-pet-feed-orange-cute-3469551/

State of the machine

At the moment of creation (when I knitted this document) this was the state of my machine:

sessioninfo::session_info()
## ─ Session info ──────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 3.5.1 (2018-07-02)
##  os       macOS High Sierra 10.13.6   
##  system   x86_64, darwin15.6.0        
##  ui       X11                         
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  tz       Europe/Amsterdam            
##  date     2018-09-10                  
## 
## ─ Packages ──────────────────────────────────────────────────────────────
##  package     * version    date       source                         
##  assertthat    0.2.0      2017-04-11 CRAN (R 3.5.0)                 
##  backports     1.1.2      2017-12-13 CRAN (R 3.5.0)                 
##  bindr         0.1.1      2018-03-13 CRAN (R 3.5.0)                 
##  bindrcpp      0.2.2      2018-03-29 CRAN (R 3.5.0)                 
##  blogdown      0.8        2018-07-15 CRAN (R 3.5.0)                 
##  bookdown      0.7        2018-02-18 CRAN (R 3.5.0)                 
##  cli           1.0.0      2017-11-05 CRAN (R 3.5.0)                 
##  clisymbols    1.2.0      2017-05-21 CRAN (R 3.5.0)                 
##  crayon        1.3.4      2017-09-16 CRAN (R 3.5.0)                 
##  digest        0.6.15     2018-01-28 CRAN (R 3.5.0)                 
##  dplyr       * 0.7.6      2018-06-29 CRAN (R 3.5.1)                 
##  emo           0.0.0.9000 2018-09-04 Github (hadley/emo@02a5206)    
##  evaluate      0.11       2018-07-17 CRAN (R 3.5.0)                 
##  fansi         0.2.3      2018-05-06 CRAN (R 3.5.0)                 
##  glue          1.3.0      2018-09-04 Github (tidyverse/glue@4e74901)
##  htmltools     0.3.6      2017-04-28 CRAN (R 3.5.0)                 
##  knitr         1.20       2018-02-20 CRAN (R 3.5.0)                 
##  lubridate     1.7.4      2018-04-11 CRAN (R 3.5.0)                 
##  magrittr      1.5        2014-11-22 CRAN (R 3.5.0)                 
##  pillar        1.3.0      2018-07-14 CRAN (R 3.5.0)                 
##  pkgconfig     2.0.1      2017-03-21 CRAN (R 3.5.0)                 
##  purrr       * 0.2.5      2018-05-29 CRAN (R 3.5.0)                 
##  R6            2.2.2      2017-06-17 CRAN (R 3.5.0)                 
##  Rcpp          0.12.18    2018-07-23 CRAN (R 3.5.0)                 
##  rlang         0.2.2      2018-08-16 cran (@0.2.2)                  
##  rmarkdown     1.10       2018-06-11 CRAN (R 3.5.0)                 
##  rprojroot     1.3-2      2018-01-03 CRAN (R 3.5.0)                 
##  rstudioapi    0.7        2017-09-07 CRAN (R 3.5.0)                 
##  sessioninfo   1.0.0      2017-06-21 CRAN (R 3.5.0)                 
##  stringi       1.2.4      2018-07-20 CRAN (R 3.5.0)                 
##  stringr       1.3.1      2018-05-10 CRAN (R 3.5.0)                 
##  tibble      * 1.4.2      2018-01-22 CRAN (R 3.5.0)                 
##  tidyselect    0.2.4      2018-02-26 CRAN (R 3.5.0)                 
##  utf8          1.1.4      2018-05-24 CRAN (R 3.5.0)                 
##  withr         2.1.2      2018-03-15 CRAN (R 3.5.0)                 
##  xfun          0.3        2018-07-06 CRAN (R 3.5.0)                 
##  yaml          2.2.0      2018-07-25 CRAN (R 3.5.0)

  1. Incidentally, this is a great way to label things. Don’t create a gender column with male or female (or other), but make the label the description: female TRUE/FALSE.