Reading in your training data

Data Ingestion Patterns for ML

How do you get your training data into your model? Most tutorials and kaggle notebooks begin with reading of csv data, but in this post I hope I can convince you to do better. I think you should spend as much time as possible in the feature engineering and modelling part of your ML project, and as little time as possible on getting the data from somewhere to your machine. [Read More]

Quick post - detect and fix this ggplot2 antipattern

Recently one of my coworkers showed me a ggplot and although it is not wrong, it is also not ideal. Here is the TL:DR : Whenever you find yourself adding multiple geom_* to show different groups, reshape your data In software engineering there are things called antipatterns, ways of programming that lead you into potential trouble. This is one of them. I’m not saying it is incorrect, but it might lead you into trouble. [Read More]