Sequential Recommender Systems Walk-Through

August 10, 2025 in machine-learning, recommendation-systems · 3 minutes

By modeling the same example, this post compares the differences among all SRS-related methods.

Introduction

I am blessed to have a cat baby, and I enjoy baking a lot; however, recently I started trying to lose weight. So my Amazon records look like this:

  • Months 1–6: lots of cat stuff (litter mat, fountain, scratching post, treats, toys, carrier).
  • Months 7–9: baking accessories (sheet pans, silicone mats, piping bags, cake turntable, mixer bowl).
  • Last 2 weeks: workout kick (Theragun → whey protein → lifting gloves).

Apparently my purchases have sequential dependencies, which conventional recommendation systems (including collaborative filtering and content-based filtering) cannot capture: they model consumers by their interactions with items in an order-agnostic way, learning only pairwise correlations between the consumer and items based on engagements (clicks, conversions, etc.).

To model such sequential dependencies, there are many different models, ranging from non-DNN and DNN approaches to the latest LLMs. This doc summarizes how each model handles my user behavior. For the difficulties/challenges/characteristics of sequential recommenders, please refer to Sequential Recommender Systems: Challenges, Progress and Prospects.

Collaborative Filtering - Matrix-factorization CF

Matrix-Factorization CF learns a latent vector $p_u$ for each user. Although it is not a per-category frequency, in practice $p_u$ ends up being:

$$p_u = (Y^\top C^u Y + \lambda I)^{-1} Y^\top C^u r_u$$

Here:

  • $n$ is the number of items; $k$ is the item factor dimension.
  • $Y \in \mathbb{R}^{n \times k}$: item factor matrix. Row $i$ is the $k$-dim vector for item $i$.
  • $C^u \in \mathbb{R}^{n \times n}$: a diagonal confidence matrix (all off-diagonal entries = 0) of user $u$ for items. $c_{ui}$ is the confidence score of user $u$ with item $i$. It is normally derived from the engagement data $r_{ui}$ (#clicks, #conversions):

$$c_{ui} = 1 + \alpha r_{ui}$$

The term $(Y^\top C^u Y + \lambda I)^{-1}$ acts as a normalization, while $C^u$ and $r_u$ are both derived from consumer engagement, so the learned consumer representation ends up being an engagement-weighted (clicks/conversions) combination of item vectors. In my case, my representation will be closest to the cat-supply embeddings, so when predicting the next items, the model will likely recommend more cat supplies.
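The closed-form user factor above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full ALS implementation; the $\alpha$ and $\lambda$ values and the toy engagement vector are made-up assumptions.

```python
import numpy as np

def user_factor(Y, r_u, alpha=40.0, lam=0.1):
    """Closed-form user factor: p_u = (Y^T C^u Y + lam*I)^{-1} Y^T C^u r_u.

    Y    : (n_items, k) item factor matrix
    r_u  : (n_items,) engagement vector for one user (clicks/conversions)
    alpha, lam : confidence scaling and L2 regularization (illustrative values)
    """
    c_u = 1.0 + alpha * r_u                                  # diagonal of C^u
    A = Y.T @ (c_u[:, None] * Y) + lam * np.eye(Y.shape[1])  # Y^T C^u Y + lam*I
    b = Y.T @ (c_u * r_u)                                    # Y^T C^u r_u
    return np.linalg.solve(A, b)

# Toy catalog: 3 heavily-engaged "cat" items and 1 barely-touched "gym" item.
rng = np.random.default_rng(0)
Y = rng.normal(size=(4, 2))
r_u = np.array([5.0, 3.0, 4.0, 0.2])
p_u = user_factor(Y, r_u)
scores = Y @ p_u  # predicted affinity per item: dominated by the cat items
```

Because $C^u$ is diagonal, the normal equations reduce to a single $k \times k$ solve per user, which is what makes the ALS-style update cheap.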

Before DNN - Sequential pattern mining

Mine frequent patterns from sequence data, then use those patterns for subsequent recommendations. Although simple and straightforward, the mined patterns can be redundant. For example, I buy cat supplies monthly while occasionally buying baking supplies in between, so the pattern could end up as something like cat_food -> cat_litter -> baking_supplies.
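As a toy sketch of this idea, we can count frequent consecutive item pairs in a purchase history. Real sequential pattern mining algorithms (e.g. PrefixSpan) also handle gaps and longer patterns; the item names and support threshold below are illustrative assumptions.

```python
from collections import Counter

def frequent_bigrams(sequence, min_support=2):
    """Mine consecutive item pairs that occur at least min_support times."""
    counts = Counter(zip(sequence, sequence[1:]))
    return {pair: c for pair, c in counts.items() if c >= min_support}

# Illustrative history: monthly cat supplies with baking mixed in
history = ["cat_food", "cat_litter", "cat_food", "cat_litter",
           "baking_supplies", "cat_food", "cat_litter"]
patterns = frequent_bigrams(history)
# ("cat_food", "cat_litter") occurs 3 times, so it survives as a pattern,
# while the one-off ("cat_litter", "baking_supplies") pair is filtered out.
```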

Basic Markov Chain

The hypothesis is that the next purchase depends only on the previous $k$ purchases. A transition matrix is learned, where each entry represents how often the sequence moves from state $i$ to state $j$, followed by row normalization (each row sums to 1).

  • A first-order chain computes $P(x_t = j \mid x_{t-1} = i)$
  • A higher-order chain computes $P(x_t = j \mid x_{t-1}, x_{t-2}, \ldots, x_{t-k})$
    • For a higher-order chain, one common simplification predicts with an average of per-lag transitions:
    $$P(x_t = j \mid x_{t-1}, \ldots, x_{t-k}) \approx \frac{1}{k} \sum_{m=1}^{k} P(x_t = j \mid x_{t-m})$$

The transition matrix looks something like:

$$T_{ij} = \frac{\text{count}(i \to j)}{\sum_{j'} \text{count}(i \to j')}, \qquad \sum_j T_{ij} = 1$$

Since my last item is lifting gloves, the prediction will be the item with the largest probability in transition-matrix row $i = \text{lifting gloves}$, i.e. $\hat{x}_{t+1} = \arg\max_j T_{ij}$.
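The whole first-order pipeline (count transitions, row-normalize, predict via the argmax of the last item's row) can be sketched as follows; the item names echo the workout kick above but the history itself is made up.

```python
from collections import defaultdict

def transition_matrix(sequence):
    """Estimate first-order transitions P(next | current): count consecutive
    pairs, then row-normalize so each row sums to 1."""
    counts = defaultdict(lambda: defaultdict(int))
    for cur, nxt in zip(sequence, sequence[1:]):
        counts[cur][nxt] += 1
    return {i: {j: c / sum(row.values()) for j, c in row.items()}
            for i, row in counts.items()}

# Illustrative recent history, ending with lifting gloves
history = ["theragun", "whey_protein", "lifting_gloves",
           "whey_protein", "lifting_gloves"]
T = transition_matrix(history)
# Predict the next purchase: argmax over the row of the last item
next_item = max(T[history[-1]], key=T[history[-1]].get)
```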

Latent Markov Embedding based approach

Deep Learning Era

RNN-based approaches

Recurrent Neural Networks can capture longer-term dependencies in sequential data, making them more suitable for modeling complex user behavior patterns.
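A minimal sketch of the mechanism, with random untrained weights: the hidden state is updated once per purchase, so it carries the whole history forward, and a final projection scores every catalog item as the next purchase. Production session-based models (e.g. GRU4Rec) use trained GRUs rather than this toy vanilla RNN, and all dimensions below are illustrative.

```python
import numpy as np

def rnn_next_item_scores(seq_ids, E, Wx, Wh, b, V):
    """Run a vanilla RNN over an item-ID sequence; return one score per item."""
    h = np.zeros(Wh.shape[0])
    for i in seq_ids:
        h = np.tanh(E[i] @ Wx + h @ Wh + b)  # hidden state accumulates history
    return V @ h                              # project to catalog-sized logits

rng = np.random.default_rng(1)
n_items, d, k = 6, 4, 8                       # catalog size, embed dim, hidden dim
E = rng.normal(size=(n_items, d))             # item embeddings (random stand-ins)
Wx, Wh = rng.normal(size=(d, k)), rng.normal(size=(k, k)) * 0.1
b, V = np.zeros(k), rng.normal(size=(n_items, k))
scores = rnn_next_item_scores([0, 2, 5], E, Wx, Wh, b, V)
```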

Attention-based models

Modern attention mechanisms can focus on relevant parts of the purchase history when making recommendations.
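A minimal sketch of scaled dot-product attention over a purchase history, using random embeddings as stand-ins for learned item vectors: the softmax weights say how relevant each past purchase is to the query (here, the most recent item), and the weighted sum gives a context vector for scoring candidates.

```python
import numpy as np

def attend(query, keys, values):
    """Scaled dot-product attention: softmax(K q / sqrt(d)) over history."""
    d = query.shape[-1]
    logits = keys @ query / np.sqrt(d)
    w = np.exp(logits - logits.max())  # stable softmax
    w /= w.sum()
    return w, w @ values               # attention weights, context vector

rng = np.random.default_rng(2)
history = rng.normal(size=(5, 8))  # 5 past purchases, 8-dim embeddings
query = history[-1]                # attend from the latest purchase
weights, context = attend(query, history, history)
```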

LLM Wave

Large Language Models for recommendation

Recent advances in LLMs show promise for understanding complex user preferences and generating personalized recommendations based on natural language descriptions of user behavior.

Conclusion

Sequential recommendation systems offer a more nuanced understanding of user behavior by considering the temporal order of interactions. While traditional methods focus on static user-item relationships, sequential models can capture evolving preferences and behavioral patterns over time.