How I Built a Recommendation Engine from Scratch

Recommendation systems are one of those engineering problems that feel larger than they are. Netflix, Spotify, and Amazon all build on variations of a few core ideas. This post walks through how I built one from first principles using PyTorch.

The core idea: matrix factorisation

At its simplest, a recommender is just a function: given a user and a catalogue of items, rank items by predicted preference. The cleanest way to do this is to learn latent embeddings for both users and items, then rank by dot-product similarity.

The model doesn't know what genre means. It learns that users who clicked A also clicked B, and infers a shared dimension from the pattern.
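To make the ranking step concrete, here is a minimal sketch of scoring items for one user by dot product and keeping the best k. The helper name `top_k_items` and the toy 2-dimensional vectors are illustrative assumptions; in practice the vectors would be the learned embeddings.

```python
import torch

def top_k_items(user_vec, item_matrix, k=3):
    """Rank items for one user by dot product; return the k best item indices."""
    scores = item_matrix @ user_vec      # one score per item, shape (n_items,)
    k = min(k, scores.numel())           # never ask for more items than exist
    return torch.topk(scores, k).indices.tolist()

# Toy embeddings (hypothetical values, not learned).
user_vec = torch.tensor([1.0, 0.0])
item_matrix = torch.tensor([
    [0.9, 0.1],   # item 0: close to the user's direction
    [0.1, 0.9],   # item 1: nearly orthogonal to it
    [0.5, 0.5],   # item 2: somewhere in between
])

print(top_k_items(user_vec, item_matrix, k=2))  # [0, 2]
```

The scores here are 0.9, 0.1 and 0.5, so item 0 ranks first and item 2 second.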

Setting up the training loop

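The dataset is a user-item interaction matrix: rows are users, columns are items, and values are implicit signals (clicks, plays, views). Implicit data only records what a user did interact with, so there are no explicit negatives. Negative sampling fills that gap by drawing user-item pairs with no recorded interaction and treating them as negatives to train against.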
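A minimal sketch of negative sampling, using only the standard library. The function name `sample_negatives`, the default of four negatives per positive, and the toy pairs are illustrative assumptions, not the exact setup behind the results in this post.

```python
import random

def sample_negatives(positives, n_items, n_neg=4, seed=0):
    """Pair every observed (user, item) with n_neg items that user never touched.

    Returns (user, item, label) triples: 1.0 for observed pairs, 0.0 for sampled ones.
    """
    rng = random.Random(seed)
    seen = {}
    for u, i in positives:
        seen.setdefault(u, set()).add(i)

    triples = []
    for u, i in positives:
        triples.append((u, i, 1.0))
        if len(seen[u]) >= n_items:
            continue                      # user has seen everything; nothing to sample
        drawn = 0
        while drawn < n_neg:              # sampled with replacement, so repeats can occur
            j = rng.randrange(n_items)
            if j not in seen[u]:          # reject items the user has interacted with
                triples.append((u, j, 0.0))
                drawn += 1
    return triples

positives = [(0, 1), (0, 3), (1, 2)]
triples = sample_negatives(positives, n_items=6, n_neg=2)
print(len(triples))  # 9: three positives plus two negatives each
```

Uniform sampling is the simplest choice. Sampling in proportion to item popularity is a common alternative, since uniformly drawn negatives tend to be obscure items that are easy to tell apart.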

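from torch import nn

class MatrixFactorisation(nn.Module):
    def __init__(self, n_users, n_items, dim=64):
        super().__init__()
        # One learned vector of size dim per user and per item.
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)

    def forward(self, u, i):
        # u, i: LongTensors of user and item indices with the same shape.
        # Returns one dot-product score per (user, item) pair.
        return (self.user_emb(u) * self.item_emb(i)).sum(dim=-1)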
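The loop that fits this model might look like the sketch below. It treats the dot product as a logit and trains with binary cross-entropy on (user, item, label) triples. The Adam optimiser, the learning rate, the batch size, and the `train` helper itself are assumptions chosen for illustration; only the 20-epoch default comes from this post. The model class is repeated so the snippet runs on its own.

```python
import torch
from torch import nn

class MatrixFactorisation(nn.Module):  # repeated from above so this runs standalone
    def __init__(self, n_users, n_items, dim=64):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)

    def forward(self, u, i):
        return (self.user_emb(u) * self.item_emb(i)).sum(dim=-1)

def train(model, users, items, labels, epochs=20, lr=0.01, batch_size=1024):
    """Fit the model on (user, item, label) triples; return the mean loss per epoch."""
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()      # applies the sigmoid internally
    n = users.shape[0]
    history = []
    for _ in range(epochs):
        perm = torch.randperm(n)          # reshuffle the triples every epoch
        total = 0.0
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            optimiser.zero_grad()
            loss = loss_fn(model(users[idx], items[idx]), labels[idx])
            loss.backward()
            optimiser.step()
            total += loss.item() * idx.numel()
        history.append(total / n)
    return history

# Toy data (hypothetical): label 1.0 is an observed interaction, 0.0 a sampled negative.
torch.manual_seed(0)
users = torch.tensor([0, 0, 1, 1, 2, 2])
items = torch.tensor([0, 1, 1, 2, 0, 2])
labels = torch.tensor([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])

model = MatrixFactorisation(n_users=3, n_items=3, dim=8)
history = train(model, users, items, labels, epochs=50, lr=0.05)
print(history[0], history[-1])
```

On real data the triples would come from the interaction log plus sampled negatives, and the loss should be tracked on a held-out split as well as on the training set.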

Results and reflections

After 20 epochs on 500K interactions, the model produces genuinely useful recommendations, much better than those from a popularity-based baseline. The hardest part was not the model; it was getting clean interaction data.
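A comparison against a popularity baseline can be made concrete with a metric such as hit rate at k: the fraction of held-out interactions whose item appears in the user's top-k list. The sketch below is illustrative only. The function names and toy data are hypothetical, and the numbers it prints are not the results described above.

```python
from collections import Counter

def popularity_ranking(train_pairs, k):
    """Return the k most frequently interacted-with items in the training data."""
    counts = Counter(item for _, item in train_pairs)
    return [item for item, _ in counts.most_common(k)]

def hit_rate_at_k(recommend, held_out, k):
    """Fraction of held-out (user, item) pairs whose item is in recommend(user, k)."""
    if not held_out:
        return 0.0
    hits = sum(1 for user, item in held_out if item in recommend(user, k))
    return hits / len(held_out)

# Toy split (hypothetical): item 10 is the most popular, then 11, then 12.
train_pairs = [(0, 10), (1, 10), (3, 10), (0, 11), (1, 12), (2, 11)]
held_out = [(2, 10), (0, 12), (1, 11), (3, 11)]

def baseline(user, k):
    # The popularity baseline ignores the user and recommends the same list to everyone.
    return popularity_ranking(train_pairs, k)

print(hit_rate_at_k(baseline, held_out, k=1))  # 0.25
print(hit_rate_at_k(baseline, held_out, k=2))  # 0.75
```

Any recommender that exposes the same `recommend(user, k)` shape, including one backed by the learned embeddings, can be scored with the same function and compared on the same held-out pairs.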

3 Comments

Ananya Krishnan Apr 23, 2026

Really clear walkthrough. The negative sampling explanation finally made this click for me. Following for more ML content!

Rahul Menon Apr 23, 2026

How do you handle the cold-start problem when you have zero interaction data for a new user?

Gokula Prasanth Author Apr 23, 2026

Great question. For cold-start I fall back to content-based filtering — using item metadata similarity instead of collaborative signals. Once a user has 5+ interactions the collaborative model kicks in.
