Running an R Script on a Schedule: Azure Functions (Serverless)

timer-trigger in Azure Functions

In this post I will show how I run an R script on a schedule, using the ‘serverless’ computing service on the Microsoft cloud called Azure Functions. In short, I will use a custom Docker container, install the required software, install the required R packages using {renv}, and deploy it in the Azure cloud. I set up the process in Azure so that it runs once a day without any supervision. [Read More]

Testing Azure Functions Locally with Azurite

Supplying secrets and simulating storage

I’ve been developing Azure Functions with R for the past week. There are some nice basic tutorials on running custom code on ‘Functions’, but they all create a simple web app. That is, the Docker container responds to HTTP triggers. However, if you want to use a different trigger, you need to have a storage account too. There are two ways to do this: use the actual storage account you created on Azure, or simulate storage with the ‘azurite’ container. [Read More]

TIL: Vectorization in Advent of Code Day 15

Indexing vectors is super fast!

I spent a lot of time yesterday on day 15 of Advent of Code (I’m three days behind, I think). Advent of Code is a nice way to practice your programming skills, and even though I think of myself as an advanced R programmer, I learned something yesterday! The challenge is this: While you wait for your flight, you decide to check in with the Elves back at the North Pole. [Read More]

Stability, Portability and Flexibility Trade-offs

I think a lot about moving single R scripts from someone’s computer to the cloud (another computer). One of the major questions you need to answer is: Can I give my solution to someone else in a way that it ‘just’ works? R is a high-level language. This allows you to write out the steps you want to take, while the actual implementation stays hidden (can you imagine writing all the steps your computer needs to take? [Read More]

Rectangling (Social) Network Data, Advanced Options

Link features, for link prediction

This walkthrough is a follow-up to my previous post about rectangling network data. As a recap: we want to predict links between nodes in a graph by using features of the vertices. In the previous post I showed how to load flat files into a graph structure with {tidygraph}, how to select positive and negative examples, and how to extract some node features. Because we want to predict whether a link between two nodes is probable, we can use the node features, but there is also other information about the edges in the graph that we cannot get out with a node-features-only procedure. [Read More]

Predicting links for network data

NETWORKS, PREDICT EDGES

Can we predict if two nodes in the graph are connected or not? But let’s make it very practical: Let’s say you work at a social media company and your boss asks you to create a model to predict who will be friends, so you can feed those recommendations back to the website and serve them to users. You are tasked with creating a model that predicts, once a day for all users, who is likely to connect to whom. [Read More]

Rectangling (Social) Network Data

Preparing data for link prediction

In this tutorial I will show you how we go from network data to a rectangular format that is suited for machine learning. Many things in the world are graphs (networks). For instance: real-life friendships, business interactions, links between websites and (digital) social networks. I find graphs (the formal name for networks) fascinating, and because I am also interested in machine learning and data engineering, the question naturally becomes: How do I get (social) network data into a rectangular structure for ML? [Read More]

Running an R Script on a Schedule: Overview

There are lots of rstats tutorials about creating beautiful plots, setting up Shiny applications and even a few on setting up plumber APIs (but we could use more). However, a lot of work consists of running a script without any interaction. This is an overview page for the tutorials I’ve created so far. This overview is for you if you want to know how to run your batch script (do one thing without supervision) automatically. [Read More]

Running an R Script on a Schedule: Docker Containers on gitlab

In this tutorial/howto I show you how to run a Docker container on a schedule on GitLab. Docker containers are awesome because, once made, they run everywhere! It does not matter what type of computer you have (though I believe there is a problem with ARM-based vs. other CPUs). Once I build a container, you can run it on a Linux box, Windows machine or Mac. This is also why people love containers for production: you can finally truly pick up a container from development and hand it over to production. [Read More]

Running an R Script on a Schedule: Gh-Actions

Tweeting from github actions

In this tutorial I have an R script that runs every day on GitHub Actions. It creates a curve in ggplot2 and posts that picture to Twitter. The use case is this: You have a script and it needs to run on a schedule (for instance every day). Other ways to schedule a script: I will create a new post for many of the other ways in which you can run an R script on a schedule. [Read More]