Don't Panic! A Scientific Approach to Debugging Production Failure

Your production system just broke down. What should you do now? Can you imagine your Shiny application, your Flask app, or your API service breaking down? For a beginning programmer or an operations (or DevOps) person, it can be overwhelming to deal with the logs, messages, metrics and other possibly relevant information coming at you at such a moment. And when something fails, you want to get it back to a working state as fast as possible. [Read More]

WTF is Kubernetes and Should I Care as R User?

Fearless to production

I’m going to give you a high-level overview of Kubernetes and how you can make your R work shine in Kubernetes. Are you an R user in a company that uses Kubernetes? Building R applications (models that do predictions, Shiny applications, APIs)? Curious about this whole Kubernetes thing that your coworkers are talking about? Somewhat afraid? Then I have the post for you! Many R users come from an academic background, such as statistics and the social sciences. [Read More]

Should I Move to a Database?

Long ago at a real-life meetup (remember those?), I received a t-shirt which said: “biggeR than R”. I think it was from Microsoft, which develops a special version of R with automatic parallelization. Anyway, I was thinking about the bigness (is that a word? it is now!) of your data. Is your data becoming too big? Your dataset becomes so big and unwieldy that operations take a long time. [Read More]

Distributing data science products

Where or what is production? What does it mean when someone says to bring some data science product ‘into production’? What does it mean for data science products to be in production? Is your product already in production? Is it a magical place? I think two questions are important: does my ‘thing’ provide value? Is my work repeatable? If the answer to both questions is yes, then your ‘thing’ is in production. [Read More]

UseR2021: Integrating R into Production

A view on UseR 2021

This year’s useR was completely online, and I watched many of the talks. I believe the videos will be made public in the future, but there were some talks that I wanted to highlight. I think that the biggest problem with machine learning (or even data) projects is the integration with existing systems. Many machine learning products are batch or real-time predictions. For those predictions to deliver value you will need: [Read More]

Walkthrough UbiOps and Tidymodels

From Python cookbook to R {recipes}

In this walkthrough I modified a tutorial from the UbiOps cookbook, ‘Python Scikit learn and UbiOps’, but I replaced everything Python with R. So instead of scikit-learn I’m using {tidymodels}, and where Python uses a requirements.txt, I will use {renv}. So in a way I’m going from Python cookbook to {recipes} in R!

Components of the pipeline

The original cookbook (and my rewrite too) has three components: [Read More]

Reasons to Use Tidymodels

I was listening to episode 135 of ‘Not So Standard Deviations’, ‘Moderate Confidence’. The hosts, Hilary and Roger, talked about when to use tidymodels packages and when not to. Here are my 2 cents on when I think it makes sense to use these packages and when it does not. When not: you are always using GLM models (they are very flexible!). It makes no sense to me to go for the extra {parsnip} layer if you are always using the same models. [Read More]

Tidymodels on UbiOps

I’ve been working with UbiOps lately, a platform that runs your data science models as a service. They have recently started supporting R alongside Python! So let’s see if we can deploy a tidymodels model to UbiOps! I am not going to tell you a lot about UbiOps; that is for another post. I presume you know what it is, you know what tidymodels means for R, and you want to combine these things. [Read More]

Deploy to Shinyapps.io from Github Actions

Last week I spent a few hours figuring out how to automatically deploy a Shiny app to two apps on shinyapps.io from GitHub. You can see the result in this GitHub repository, which is connected to two Shiny apps on shinyapps.io. Here is what I envisioned: every new commit to the main branch will be published to the main app. We could then lock down the main branch so that no one can commit directly to main. [Read More]

Running an R Script on a Schedule: Azure Functions (Serverless)

timer-trigger in Azure Functions

In this post I will show how I run an R script on a schedule by making use of a ‘serverless’ computing service on the Microsoft cloud called Azure Functions. In short, I will use a custom Docker container, install the required software, install the required R packages using {renv}, and deploy it in the Azure cloud. I program the process in Azure such that it runs once a day without any supervision. [Read More]