Cleaning up and combining data, a dataset for practice

tl;dr: I created an open dataset for explicitly practicing data munging. Feel free to use it in assignments, but do mention where you got it from (CC-BY-4.0). Also, unicorns are awesome. Find the dataset at: https://github.com/RMHogervorst/unicorns_on_unicycles

Data munging / cleaning / engineering

At work I was working with two Excel files that were slightly different but could be combined into one dataset. This is very typical of the day-to-day cleaning operations that analysts and data scientists do (statisticians too). [Read More]
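To make the "slightly different but combinable" situation concrete, here is a minimal base R sketch. The two data frames, their column names, and their values are invented for illustration; they are not taken from the unicorn dataset or from the original Excel files, and a real cleanup would likely need more steps.

```r
# Two hypothetical tables standing in for the two Excel files.
# Same information, but the column names differ in capitalisation.
file_a <- data.frame(
  country = c("Austria", "France"),
  year    = c(1670, 1671),
  count   = c(2, 5),
  stringsAsFactors = FALSE
)
file_b <- data.frame(
  Country = "Germany",
  Year    = 1670,
  Count   = 3,
  stringsAsFactors = FALSE
)

# Step 1: bring the column names into one convention.
names(file_b) <- tolower(names(file_b))

# Step 2: record where each row came from before stacking.
file_a$source <- "file_a"
file_b$source <- "file_b"

# Step 3: stack the rows; rbind() matches data frame columns by name.
combined <- rbind(file_a, file_b)
```

Keeping a `source` column is a design choice: it makes it possible to trace a suspicious value back to the file it came from.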

Introducing Badgecreatr, a package that places badges in your readme

Introducing Badgecreatr, a package to create and place badges in your readme.Rmd file on GitHub. Badgecreatr will create the following badges (aka shields):

Installation

Install the package with install.packages("badgecreatr")

How do you use badgecreatr?

Badgecreatr has one main function: badgeplacer(). The simplest command is:

badgecreatr::badgeplacer(githubaccount = "yourgithubname", githubrepo = "yourpackagename", branch = "master")

If your project is in its infancy and you don’t want people to use it yet: [Read More]

Portioning projects

Often we write programs to automate things. The programs range from simple to complex, but in essence you always do the same thing: you are trying to solve a problem. A common pitfall, at least for me, is starting out too big. What you need to do is start simple and small, and only once the simple thing works, increase the complexity. Separate parts of the program need to be separate functions. [Read More]
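As a rough illustration of "start small, then combine", the sketch below splits one task into two tiny functions, checks each on its own, and only then composes them. The task (dropping missing values and rescaling to the 0 to 1 range) and all function names are hypothetical, chosen only to show the pattern.

```r
# Step 1: one small function that does one thing.
remove_missing <- function(x) {
  x[!is.na(x)]
}

# Step 2: a second small function, written only after the first works.
rescale <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Check each part separately before adding complexity.
stopifnot(isTRUE(all.equal(remove_missing(c(1, NA, 3)), c(1, 3))))
stopifnot(isTRUE(all.equal(rescale(c(0, 5, 10)), c(0, 0.5, 1))))

# Step 3: compose the tested parts into the larger program.
clean_and_rescale <- function(x) {
  rescale(remove_missing(x))
}

result <- clean_and_rescale(c(2, NA, 4, 6))
```

Because each part is its own function, a failure in the combined step can be narrowed down by re-running the small checks.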