These levels were defined by the Software Carpentry people, and I have modified them as follows:

  • beginner: You have just started out in this topic. You do not yet know how things are supposed to work. You do not have a mental model of this thing.
  • intermediate: You are a regular user of this software/tool/concept. You have a mental model, but it is not very sophisticated.
  • advanced: You have a sophisticated mental model of how things work, and you even know when the model breaks, when it does not match reality.

A Trusted Certificate for your Homelab Sites

In this post I will describe how you can set up services in Kubernetes that listen for new ingresses, create a certificate, get it signed by Let’s Encrypt, and present it on the correct website. The setup also automatically updates DNS records. In my previous post I described how you can create .local websites on your homelab, and how you can use mDNS to have the sites working without extra configuration. But those sites are served over HTTP, and many browsers feel slower with HTTP than with HTTPS. [Read More]

Making your Homelab Apps Available under a .local Domain

A touch of Traefik, externalDNS

I created a few applications on my homelab; one of those is music-assistant. Music-assistant is an awesome project that plays all your local and remote music over all your speakers. It can play Spotify, ripped CDs, web radio stations, etc., and it will play them over smart and dumb speakers. It also integrates with home-assistant. Anyway, what I want to talk about today is making services available on your home network. [Read More]

Hacking `/etc/ssl/certs/` with Containers in Corporate Networks

As a consultant I come into different organizations, usually larger ones. Making my custom applications work in those orgs often revolves around TLS certificates. This post explains how you can add custom certificates, but also how you can skip that part by injecting certificates into a pod.

Self-signed certificates in large orgs

If you work in open environments you never have to think about this, but companies of a certain size start to build a large (internal) intRAnet with custom pages and custom domains. [Read More]

Many Small Models for Speed

LLMs are pretty cool, but they are massive! If you want to run those for yourself you need a hefty GPU and quite a lot of engineering. But the world of machine learning is so much bigger than LLMs. In organizations all over the world, there are models forecasting time series, predicting prices, creating embeddings, classifying categories and whatnot. If you have several prediction/classification steps that combine into one end result, you could consider training one bigger model that does all of the things. [Read More]

Message Broker Pattern for ML Systems

I’ve seen this pattern in different places, but it is most useful for streaming data: data that comes in over time, with quite some volume. The core of the solution is a message broker. This could be lightweight like Redis, or a heavier log-like solution like Kafka. Instead of sending data from one microservice to another through API calls, we publish data to a central place, and services subscribe to data and publish their results back (that is why it is called PubSub: publish-subscribe). [Read More]
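To make the publish-subscribe idea concrete, here is a minimal sketch in Python. It assumes nothing about the real setup: `InMemoryBroker`, the topic names `events` and `predictions`, and the `score` service are all hypothetical stand-ins, and a dictionary in memory takes the place of an actual broker such as Redis or Kafka.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryBroker:
    """Toy stand-in for a real message broker (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # topic name -> list of handler functions subscribed to that topic
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message to every handler subscribed to this topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


broker = InMemoryBroker()
results: List[dict] = []


def score(message: dict) -> None:
    # A pretend "model" service: reads raw events, publishes its result back.
    broker.publish("predictions", {"id": message["id"], "score": message["value"] * 2})


broker.subscribe("events", score)                # the service listens for raw data
broker.subscribe("predictions", results.append)  # a consumer collects the results

broker.publish("events", {"id": 1, "value": 21})
print(results)  # [{'id': 1, 'score': 42}]
```

The publisher of `events` never calls `score` directly. It only knows the topic name, which is what lets you add or replace services without touching the others.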

OpenSanctions is an amazing example of entity resolution at scale

In one of my previous posts I talked about entity resolution and how data science plays a role. I am a big fan of OpenSanctions, and their process (entirely open) is a beautiful example of entity resolution. OpenSanctions is an international database of persons and companies of political, criminal, or economic interest. It’s a combination of sanction lists, publicly available databases, and criminal information. Companies can use this information to check their customers, to prevent money laundering and sanction evasion. [Read More]

Entity resolution for data scientists

or data matching, or data deduplication or record linkage

I have a problem. Others have it too: it is a problem of duplication. I’m trying to track the books I read in Bookwyrm so I can talk about them online. But there are so many duplicates! How do we know if Soren Kierkegaard, Søren Kierkegaard, and Sören Kierkegaard are the same person? This is an example of entity resolution. It is also called deduplication, record linkage and data matching. We want to compare entities from different datasets and make a confident claim that they match or not. [Read More]
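As a small illustration of the Kierkegaard example, the sketch below normalizes names before comparing them. This is only one naive approach using the Python standard library, not the method from the post: the helper names `normalize` and `similarity` are made up here, and real entity resolution usually compares more fields than a name.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a name and strip accents (hypothetical helper)."""
    # "ø" has no Unicode decomposition, so it is mapped by hand first.
    name = name.replace("ø", "o").replace("Ø", "O")
    # NFKD splits "ö" into "o" plus a combining mark, which we then drop.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Collapse repeated whitespace and lowercase.
    return " ".join(stripped.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity between two normalized names, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


names = ["Soren Kierkegaard", "Søren Kierkegaard", "Sören Kierkegaard"]
print({normalize(n) for n in names})    # {'soren kierkegaard'}
print(similarity(names[0], names[2]))   # 1.0
```

Exact equality after normalization catches spelling variants like these. The similarity score is there for near misses such as typos, where you would pick a threshold and accept some uncertainty.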

Adaptive Plasticity and Life History Theory

April Cools post

Happy April 1st! This post is part of April Cools Club: an April 1st effort to publish genuine essays on unexpected topics. I want to tell you about the fascinating topic of adaptive plasticity and life history theory. I haven’t read anything about this since 2014, but the ideas have kept a place in my head (lived there rent free? a weird expression). This is also a free day for me, so I’m going to put minimal effort into writing about this topic; I am going to write without consulting even Wikipedia. [Read More]

PyTorch on an AMD GPU (frame.work 13)

I have a frame.work laptop. It is really nice! It looks awesome and is easily repairable. I chose an AMD model, which has an integrated GPU: the AMD Ryzen 7 7840U. You can actually use this GPU with PyTorch! But you need to perform a few steps, which I write down here for future use. (I’m using Ubuntu on this device.)

  • allocate more VRAM to the GPU with a BIOS setting (go into the BIOS and change the GPU setting to gaming mode or something, see this link)
  • start a virtual environment in your project
  • install the right versions of the PyTorch packages; go to https://pytorch. [Read More]
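Once the steps above are done, a quick check like the following can confirm whether PyTorch actually sees the GPU. It is a sketch, not part of the original steps: `describe_gpu` is a hypothetical helper, and it relies on the fact that ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` interface as NVIDIA ones.

```python
def describe_gpu() -> str:
    """Report whether PyTorch can see a GPU (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    # ROCm builds of PyTorch answer through the torch.cuda namespace as well.
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no GPU is visible"
    return f"GPU visible to PyTorch: {torch.cuda.get_device_name(0)}"


print(describe_gpu())
```

Run it inside the virtual environment from the second step, so that it checks the PyTorch packages you just installed rather than a system-wide copy.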