
The Cold Start Problem: Session-Based Models, Content Fallback, and Exploration Budget

New users and new items have no history. This article covers GRU4Rec for session-based recommendations, content-based fallback using item metadata, exploration budget design, and the practical fix of using geographic/demographic priors as initialization.

The Cold Start Problem Is Three Different Problems

Cold start is treated as one problem but it's actually three, and each requires a different solution. New user cold start: a user who just installed the app has no interaction history. New item cold start: a new restaurant that just joined Swiggy has no order history, no ratings, no behavioral signals. System cold start: a brand new recommender system with no data at all — this is the bootstrapping problem that only matters at company founding.

Most interviews only ask about new user and new item cold start. These are genuinely hard because the two-tower model and collaborative filtering both require interaction data that doesn't exist yet. The question isn't whether cold start is hard — it's what you do in each phase, how you detect when a user or item has graduated out of cold start, and how you measure whether your cold start solution is working.

New User Cold Start

Phase 1 — onboarding signals. Explicitly ask the user: what cuisines do you like? What's your typical price range? Do you care more about speed or quality? Even 3-5 stated preferences dramatically improve the first session. The cost is friction; the gain is a meaningful first recommendation. Most apps ask during signup, but users often skip. Smart defaults (show the most popular restaurants in their delivery zone) provide a safe fallback.
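The smart-defaults fallback can be sketched in a few lines of plain Python. This is a minimal illustration, not a production ranker: the function name `first_session_ranking`, the `boost` multiplier, and the sample data are all hypothetical. It ranks restaurants by order count in the delivery zone and, when the user answered the onboarding questions, boosts restaurants matching a stated cuisine.

```python
from collections import Counter

def first_session_ranking(zone_orders, restaurant_cuisine, stated_cuisines=(), boost=2.0, k=3):
    """Rank restaurants for a brand-new user.

    zone_orders: restaurant ids, one entry per recent order in the user's zone.
    restaurant_cuisine: restaurant id -> cuisine tag.
    stated_cuisines: cuisines picked during onboarding (empty if skipped).
    boost: hypothetical multiplier for restaurants matching a stated cuisine.
    """
    popularity = Counter(zone_orders)
    liked = set(stated_cuisines)
    scored = {
        rid: count * (boost if restaurant_cuisine.get(rid) in liked else 1.0)
        for rid, count in popularity.items()
    }
    # Sort by score descending, then by id so ties resolve deterministically.
    return sorted(scored, key=lambda rid: (-scored[rid], rid))[:k]

# Hypothetical sample data for one delivery zone.
orders = ["r1", "r1", "r1", "r2", "r2", "r3"]
cuisine = {"r1": "pizza", "r2": "thai", "r3": "thai"}

print(first_session_ranking(orders, cuisine))            # user skipped onboarding
print(first_session_ranking(orders, cuisine, ["thai"]))  # user stated a preference
```

If the user skips onboarding, the ranking degrades to pure zone popularity, so the same code path serves both cases.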

Phase 2 — session signals. Within the first session, behavioral signals accumulate fast. The user scrolled past Italian but stopped on Thai — implicit preference signal. They clicked a restaurant and spent 2 minutes looking at the menu but didn't order — interest signal. They added an item to cart then removed it — indecision signal, possibly price sensitivity. A session-based recommender (GRU4Rec, BERT4Rec) updates recommendations in real time based on within-session behavior. The model is trained on other users' sessions, so it needs no historical data for this particular user.

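# GRU4Rec: session-based recommendation (minimal PyTorch sketch)
import torch.nn as nn

class GRU4Rec(nn.Module):
    def __init__(self, n_items, hidden_size=100, embed_dim=50):
        super().__init__()
        # Index 0 is reserved for padding, so real item ids start at 1
        self.item_embed = nn.Embedding(n_items, embed_dim, padding_idx=0)
        self.gru = nn.GRU(embed_dim, hidden_size, batch_first=True)
        self.output = nn.Linear(hidden_size, n_items)

    def forward(self, item_sequence):
        # item_sequence: (batch, seq_len) LongTensor of items clicked in this session.
        # Left-pad shorter sessions so the last position always holds a real item.
        x = self.item_embed(item_sequence)    # (batch, seq_len, embed_dim)
        output, hidden = self.gru(x)          # output: (batch, seq_len, hidden_size)
        # Score every item as the next click, using the last time step
        return self.output(output[:, -1, :])  # (batch, n_items)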

Phase 3 — graduated user. After 5-10 interactions, the collaborative filtering signal becomes meaningful. The user can transition from cold start to standard two-tower retrieval. The transition threshold is a product decision: transition too early and recommendations are noisy; too late and you miss personalization opportunities.

New Item Cold Start

A new restaurant on Swiggy has photos, a menu, a location, a price range, and cuisine tags. It has no orders, no ratings, no interaction embeddings. The content-based approach: pass these content features through the item tower and use the resulting feature embedding directly, without waiting for any interactions with the new restaurant. This gives a reasonable initial representation based on what the item is, even before any user has interacted with it.
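To make the content-based idea concrete, here is a small sketch that stands in for the item tower with a hand-built feature vector (multi-hot cuisine tags plus a one-hot price bucket) and finds the existing restaurants closest to a new one by cosine similarity. The tag vocabulary, price buckets, restaurant names, and function names are all hypothetical; a real system would use the learned tower's embedding instead of this vector.

```python
import math

CUISINES = ["thai", "italian", "indian"]   # hypothetical tag vocabulary
PRICE_BUCKETS = ["low", "mid", "high"]     # hypothetical price ranges

def content_vector(cuisine_tags, price):
    """Multi-hot cuisine tags followed by a one-hot price bucket."""
    vec = [1.0 if c in cuisine_tags else 0.0 for c in CUISINES]
    vec += [1.0 if p == price else 0.0 for p in PRICE_BUCKETS]
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest_existing(new_vec, catalog, k=2):
    """Return the k existing restaurant ids most similar to the new one."""
    ranked = sorted(catalog, key=lambda rid: (-cosine(new_vec, catalog[rid]), rid))
    return ranked[:k]

# Existing restaurants that already have interaction data (hypothetical).
catalog = {
    "thai_house": content_vector({"thai"}, "mid"),
    "pasta_bar": content_vector({"italian"}, "mid"),
    "curry_corner": content_vector({"indian"}, "low"),
}

# A new mid-priced Thai restaurant with zero orders.
new_restaurant = content_vector({"thai"}, "mid")
print(nearest_existing(new_restaurant, catalog))
```

The nearest neighbors suggest which users to show the new restaurant to first: those who already order from similar places.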

Exploration budget: deliberately show new items to a small percentage of users who match the predicted audience. Measure actual interaction rates. This bootstraps interaction data while managing the cost of showing potentially bad recommendations. The exploration vs exploitation tradeoff here is explicit — you're spending some recommendation quality today to learn faster.
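One simple way to implement an exploration budget is to swap the last slot of the ranked list for a cold-start item on a small fraction of requests. The sketch below assumes a 5% budget and a single exploration slot; both numbers, and the function name `apply_exploration`, are illustrative choices rather than recommended values. It also assumes the new items were already filtered to the user's predicted audience upstream.

```python
import random

def apply_exploration(ranked, new_items, budget=0.05, rng=random):
    """With probability `budget`, swap the last slot for a cold-start item.

    ranked: items from the main model, best first.
    new_items: cold-start candidates already matched to this user's
        predicted audience (that matching is out of scope here).
    Returns (final_list, explored_item_or_None) so exposure can be logged.
    """
    candidates = [item for item in new_items if item not in ranked]
    if not ranked or not candidates or rng.random() >= budget:
        return list(ranked), None
    explored = rng.choice(candidates)
    # Replace the weakest recommendation so the top of the list is untouched.
    return list(ranked[:-1]) + [explored], explored

rng = random.Random(0)  # seeded only to make the example repeatable
shown, explored = apply_exploration(["a", "b", "c"], ["new1"], budget=0.05, rng=rng)
print(shown, explored)
```

Returning the explored item alongside the list matters: without logging which impressions were exploratory, you cannot measure the new item's actual interaction rate or the quality cost of the budget.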

Detecting Cold Start Graduation

A user has graduated from cold start when their collaborative filtering predictions have lower uncertainty than the content-based fallback. Prediction uncertainty is hard to measure directly, so in practice interaction count serves as a proxy: track the user's interaction count and set a threshold (typically 5-15 interactions). For items: track order count and set a threshold (typically 10-50 orders). Below threshold → cold start treatment. Above threshold → full model.
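The threshold rule can be expressed as a small routing function. The specific thresholds (10 interactions, 30 orders) are hypothetical picks from within the ranges mentioned above, and the strategy names are placeholders for whatever retrieval paths a real system exposes.

```python
USER_THRESHOLD = 10   # hypothetical, within the 5-15 interaction range
ITEM_THRESHOLD = 30   # hypothetical, within the 10-50 order range

def user_strategy(interaction_count, session_length):
    """Pick a retrieval path for a user based on how much data exists."""
    if interaction_count >= USER_THRESHOLD:
        return "two_tower"            # graduated: full collaborative model
    if session_length > 0:
        return "session_model"        # cold, but has in-session clicks
    return "popularity_fallback"      # nothing yet: smart defaults

def item_strategy(order_count):
    """Pick a treatment for an item based on its order history."""
    if order_count >= ITEM_THRESHOLD:
        return "full_model"
    return "content_plus_exploration"

print(user_strategy(interaction_count=0, session_length=0))
print(user_strategy(interaction_count=3, session_length=4))
print(user_strategy(interaction_count=12, session_length=4))
print(item_strategy(order_count=5))
```

Keeping the thresholds as named constants makes them easy to tune in an experiment, which is how the cost of graduating too early or too late gets measured.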

Interview pattern: interviewers ask 'how do you handle cold start' expecting you to name session-based models or content-based fallback. The senior answer adds: how do you detect when a user has graduated from cold start, how do you measure whether your cold start solution is actually working (first-session retention, not just CTR), and what the cost is of getting it wrong in each direction.
