These levels have been defined by the Software Carpentry people, and I have modified them as follows:

  • beginner: You have just started out in this topic. You do not yet know how things are supposed to work, and you do not have a mental model of this thing.
  • intermediate: You are a regular user of this software/tool/concept. You have a mental model, but it is not very sophisticated.
  • advanced: You have a sophisticated mental model of how things work, and you even know when the model breaks, when it does not match reality.

A Model not in Production is a Waste of Money and Time

I always push people to make their ML projects reach production, even if the result is not that good yet and even if you could eke out a bit more performance. I’ve been inspired by the DevOps and lean movements, and I hope you will be too. ML products have many ways to improve; you can always tweak more. But ML is high risk, with a possibly high reward, and relatively expensive compared to ‘normal code’. [Read More]

The Disney+ App Really Sucks

The Disney+ app really sucks. I have an Android device that hosts the app, and I only ever play on the Chromecast. To be clear, I use a Chromecast on a TV with CEC enabled. That is, you can send commands from your remote to connected devices. This is really nice: you can pause, play, stop, rewind, and toggle subtitles. You can skip ahead, and back. There is even a button to accept things. [Read More]

So you've just lost a million dollars in the genAI hype

what lessons can you learn?

Hi C-level person! Are you feeling down because AI is not working for you? Let me know if this is you: a smug consultant sold you a genAI solution. By now you’ve realised that it doesn’t work; it cannot work in theory, and now it also doesn’t work in practice. You still have data quality issues, and your promised profits are non-existent. Are there any lessons you can learn from this fiasco? [Read More]

A rant about tp-link wifi boxes

No internet? no wifi for you!

My internet was down for several days (see previous post), and the only thing that really broke, apart from the obviously internet-connected services on Home Assistant, was the wifi. I have TP-Link Deco boxes, and they work okay most of the time. They form a mesh and connect over whatever link is best (powerline, point-to-point wifi, or a network cable). In general, they just work. Until your internet connection is down. [Read More]

An offline first smart home is really nice

Local first smart home was a great decision

Recently my internet connection was down for several days; where I live in Europe, that almost never happens. I am so used to having an internet connection that I really had to adjust. Of course, having mobile phones with data connections meant my online addiction was regularly fed, but I can’t roam my home LAN over my mobile connection (yet?). Home Assistant just kept going, and that is awesome! [Read More]

ChatGPT in (the Core of) your Product is a Bad Idea

Foundational models are inherently risky.

Google ~~Bard~~ Gemini, Claude, or ChatGPT seem to be able to do many things. They have easy APIs and many plugins. The price is lower than seems possible. And yet, integrating these things into your product is really risky. Here is why. Problems with foundational models: these “AIs” are built on foundational models. They are trained on massive amounts of text data, and then fine-tuned for specific tasks. We don’t know what data was used for training. [Read More]

The art (and science) of feature engineering

combining best practices from science, and engineering

Data scientists, in general, do not just throw data into a model. They use feature engineering: transforming input data to make it easy for the chosen machine learning algorithm to pick up the subtleties in the data. Data scientists do this so the model can predict outcomes better. In the image below you see a transformation of data into numeric values with meaning. In this article I’ll discuss why we still need feature engineering (FE) in the age of large language models, and what some best practices are. [Read More]
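The kind of transformation the teaser describes can be sketched in a few lines. This is a minimal, hedged example, not the post's actual code: the size categories and their ordinal mapping are made up for illustration.

```python
# Feature engineering sketch: turn a raw categorical column into
# numeric values with meaning, so a model can use the ordering.
raw = ["small", "medium", "large", "medium"]

# Map each category to a number that preserves its natural order.
size_order = {"small": 0, "medium": 1, "large": 2}
features = [size_order[v] for v in raw]

print(features)  # [0, 1, 2, 1]
```

The point is that the encoding carries meaning (small < medium < large), which a plain label-to-arbitrary-integer mapping would not guarantee.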

Creating One Unified Calendar of all Data Science Events in the Netherlands

Over engineering with renv and github actions

I enjoy learning new things about machine learning, and I enjoy meeting like-minded people too. That is why I go to meetups and conferences. But not everyone I meet becomes a member of every group, so I keep sending my coworkers new events that I hear about here in the Netherlands. And it is easy to overlook a new event that comes in over email. I, individually, cannot scale. So in this post I will walk you through an over-engineered solution to make myself unnecessary. [Read More]

Gosset part 2: small sample statistics

Scientific brewing at scale

Simulation was the key to achieving world beer dominance: ‘scientific’ brewing at scale in the early 1900s. This post is an explainer about the small-sample experiments performed by William S. Gosset. It contains some R code that simulates his simulations and the resulting determination of the ideal sample size for inference. If you brew your own beer, or if you want to know how many samples you need to say something useful about your data, this post is for you. [Read More]
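Not Gosset's actual procedure, and a Python stand-in rather than the post's R code, but in the same spirit: a small simulation showing how sample size affects the spread of sample means, which is the quantity that drives "how many samples do I need?".

```python
# Simulate many experiments at each sample size and measure how much
# the sample mean wanders. Larger samples -> tighter estimates.
import random
import statistics

random.seed(42)

def mean_spread(n, trials=2000):
    """Standard deviation of sample means for samples of size n."""
    means = [statistics.mean(random.gauss(0, 1) for _ in range(n))
             for _ in range(trials)]
    return statistics.stdev(means)

for n in (2, 4, 8, 16):
    print(n, round(mean_spread(n), 3))
```

The spread shrinks roughly as 1/sqrt(n), so doubling the sample size buys less and less precision, which is exactly the trade-off a brewer deciding how many barrels to test has to weigh.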

William Sealy Gosset one of the first data scientists

The father of the t-distribution

I think William Sealy Gosset, better known as ‘Student’, was the first data scientist. He used math to solve real-world business problems; he worked on experimental design, small-sample statistics, quality control, and beer. In fact, I think we should start a fan club! And as the first member of that fan club, I have been to the Guinness brewery to take a picture of Gosset’s only visible legacy there. W. S. [Read More]