Embeddings in Recommendation Systems
Mapping users, items, and context into one vector space
The most fundamental question in RecSys is "will this user like this item?" Embeddings turn this into a distance problem: score the pair by how close their vectors are in a shared space.
If a user vector and an item vector are close, the user likely prefers that item. This is the core of Two-Tower models, and Matrix Factorization is the simplest instance of the same idea: a dot product between learned user and item vectors.
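The distance framing can be sketched in a few lines. The vectors below are made-up 4-dimensional examples, not output of any trained model:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical embeddings: the user points in roughly the same
# direction as item_a, so item_a scores higher than item_b.
user   = [0.9, 0.1, 0.4, 0.0]
item_a = [0.8, 0.2, 0.5, 0.1]
item_b = [0.0, 0.9, 0.1, 0.8]

print(cosine_similarity(user, item_a) > cosine_similarity(user, item_b))  # True
```

A dot product works the same way when the vectors are (approximately) normalized, which is why both appear interchangeably in the literature.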
Types of embeddings
ID embeddings: Learnable vectors keyed by user ID and item ID. The most basic type, but weak against cold start: an unseen ID has no trained vector.
Feature embeddings: Vectorize attributes like category, tags, price range. The Deep part of Wide&Deep does this.
Sequence embeddings: Encode entire behavior sequences into one vector. GRU4Rec, BERT4Rec fall here.
Context embeddings: Situational info like time, location, device. The same user wants different things on a commute vs. a weekend.
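A minimal sketch of the first and third types, assuming illustrative vocabulary sizes and a random (untrained) table; in a real model these rows would be learned by gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary sizes and embedding dimension.
NUM_USERS, NUM_ITEMS, DIM = 1000, 5000, 16

# ID embeddings: one learnable row per user/item ID.
user_table = rng.normal(scale=0.1, size=(NUM_USERS, DIM))
item_table = rng.normal(scale=0.1, size=(NUM_ITEMS, DIM))

def embed_user(user_id: int) -> np.ndarray:
    # Plain row lookup; an unseen ID has no trained row,
    # which is exactly the cold-start weakness noted above.
    return user_table[user_id]

def embed_history(item_ids: list[int]) -> np.ndarray:
    # A sequence embedding can be as crude as mean-pooling the item
    # vectors in a user's recent history; GRU4Rec and BERT4Rec replace
    # this pooling with learned recurrent / transformer encoders.
    return item_table[item_ids].mean(axis=0)
```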
How It Works
1. Define features for User/Item/Context
2. Vectorize each feature via an Embedding Layer
3. Combine the vectors (concat/attention) into a unified representation
4. Compute a matching score via dot product or cosine similarity
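The four steps can be sketched end to end. The feature fields, vocabulary sizes, and dimensions here are all illustrative, and the concatenation stands in for what a real two-tower model would pass through per-tower MLPs:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8  # per-feature embedding dimension (illustrative)

# Steps 1-2: one embedding table per feature field.
tables = {
    "user_id":  rng.normal(size=(100, DIM)),
    "category": rng.normal(size=(20, DIM)),
    "hour":     rng.normal(size=(24, DIM)),  # context feature
}

def encode(feature_ids: dict[str, int]) -> np.ndarray:
    # Step 3: look up each feature's vector and concatenate them
    # into one unified representation.
    return np.concatenate([tables[name][idx] for name, idx in feature_ids.items()])

# Two fields per side, so both representations end up 16-dimensional.
user_vec = encode({"user_id": 7, "hour": 21})
item_vec = encode({"category": 3, "hour": 21})

# Step 4: matching score via dot product.
score = float(user_vec @ item_vec)
```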
Pros
- ✓ Unifies heterogeneous data (text, image, behavior) in one space
- ✓ Millisecond serving via ANN index
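The serving claim can be illustrated with exact nearest-neighbor retrieval over a precomputed item matrix; at production scale an ANN library (e.g. FAISS or ScaNN) approximates this scan to keep latency in the millisecond range. The sizes below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
item_matrix = rng.normal(size=(10_000, 32))  # precomputed item embeddings
user_vec = rng.normal(size=32)               # computed at request time

def top_k(query: np.ndarray, items: np.ndarray, k: int = 10) -> np.ndarray:
    """Exact top-k item indices by dot product.

    An ANN index approximates this ranking without scanning every item.
    """
    scores = items @ query
    return np.argsort(scores)[::-1][:k]

recs = top_k(user_vec, item_matrix)
```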
Cons
- ✗ Requires hyperparameter tuning (dimensions, learning rate)
- ✗ Embedding drift over time → periodic retraining required