Rectangling (Social) Network Data, Advanced Options

Link features, for link prediction

This walkthrough is a follow-up to my previous post about rectangling network data. As a recap: we want to predict links between nodes in a graph by using features of the vertices. In the previous post I showed how to load flat files into a graph structure with {tidygraph}, how to select positive and negative examples, and how to extract some node features. Because we want to predict whether a link between two nodes is probable, we can use the node features, but the edges in the graph also carry information that we cannot get out with a node-features-only procedure. [Read More]

Predicting links for network data

NETWORKS, PREDICT EDGES: Can we predict whether two nodes in a graph are connected or not? But let’s make it very practical: let’s say you work at a social media company and your boss asks you to create a model that predicts who will become friends, so you can feed those recommendations back to the website and serve them to users. You are tasked with creating a model that predicts, once a day and for all users, who is likely to connect with whom. [Read More]

Rectangling (Social) Network Data

Preparing data for link prediction

In this tutorial I will show you how we go from network data to a rectangular format that is suitable for machine learning. Many things in the world are graphs (the formal name for networks). For instance: real-life friendships, business interactions, links between websites, and (digital) social networks. I find graphs fascinating, and because I am also interested in machine learning and data engineering, the question naturally becomes: how do I get (social) network data into a rectangular structure for ML? [Read More]