22

Recommender Systems

Neighborhood-Based Collaborative Filtering

Theory

Collaborative filtering recommends items using the preferences of similar users (user-based) or the ratings a user has given to similar items (item-based). It does not require item features, only user-item interaction data such as ratings.

Visualization

[Figure: Neighborhood-Based Collaborative Filtering visualization]

Mathematical Formulation

User-Based CF:
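r̂ᵤᵢ = r̄ᵤ + (Σᵥ sim(u,v)·(rᵥᵢ - r̄ᵥ)) / Σᵥ |sim(u,v)|
where both sums run over the neighbors v of user u who rated item i, and r̄ᵤ is the mean rating given by user u.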
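
A minimal sketch of the user-based formula above, assuming ratings are stored in a NumPy array with np.nan for missing entries and that a user-user similarity matrix is already available. The function name predict_user_based and the toy values of R and sim are hypothetical, chosen only for illustration.

```python
import numpy as np

def predict_user_based(R, sim, u, i):
    """Mean-centered user-based prediction for user u and item i.

    R   : 2-D float array of ratings, np.nan where a rating is missing
    sim : 2-D array of user-user similarities
    Assumes every user has at least one rating.
    """
    user_means = np.nanmean(R, axis=1)   # mean rating of each user
    rated = ~np.isnan(R[:, i])           # users who rated item i
    rated[u] = False                     # the target user is not a neighbor
    if not rated.any():
        return user_means[u]             # no neighbors: fall back to the mean
    weights = sim[u, rated]
    denom = np.abs(weights).sum()
    if denom == 0:
        return user_means[u]
    deviations = R[rated, i] - user_means[rated]
    return user_means[u] + (weights * deviations).sum() / denom

# Toy data (hypothetical): 3 users x 3 items
R = np.array([[4.0, np.nan, 2.0],
              [5.0, 3.0,    1.0],
              [1.0, 5.0,    np.nan]])
sim = np.array([[ 1.0,  0.8, -0.5],
                [ 0.8,  1.0, -0.2],
                [-0.5, -0.2,  1.0]])

pred = predict_user_based(R, sim, u=0, i=1)
print(f"Predicted rating: {pred:.2f}")
```

Subtracting each neighbor's mean compensates for users who rate systematically high or low; the absolute value in the denominator keeps the normalization positive when some similarities are negative.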

Item-Based CF:
r̂ᵤᵢ = (Σⱼ sim(i,j)·rᵤⱼ) / Σⱼ |sim(i,j)|
where both sums run over the items j that user u has rated and that are most similar to item i.
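
The item-based formula above can be sketched the same way. This assumes an item-item similarity matrix has already been computed; the function name predict_item_based and the toy values are hypothetical.

```python
import numpy as np

def predict_item_based(R, item_sim, u, i):
    """Item-based prediction for user u and item i.

    R        : 2-D float array of ratings, np.nan where a rating is missing
    item_sim : 2-D array of item-item similarities
    Returns np.nan when the user has rated no usable neighbor items.
    """
    rated = ~np.isnan(R[u])      # items the user has rated
    rated[i] = False             # the target item cannot vote for itself
    if not rated.any():
        return np.nan
    weights = item_sim[i, rated]
    denom = np.abs(weights).sum()
    if denom == 0:
        return np.nan
    return (weights * R[u, rated]).sum() / denom

# Toy data (hypothetical): 3 users x 3 items
R = np.array([[4.0, np.nan, 2.0],
              [5.0, 3.0,    1.0],
              [1.0, 5.0,    np.nan]])
item_sim = np.array([[1.0, 0.2, 0.9],
                     [0.2, 1.0, 0.4],
                     [0.9, 0.4, 1.0]])

pred_item = predict_item_based(R, item_sim, u=0, i=1)
print(f"Predicted rating: {pred_item:.2f}")
```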

Similarity Metrics:
• Cosine Similarity
• Pearson Correlation
• Jaccard Similarity
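
The three metrics can be sketched for a pair of rating vectors as follows. This assumes np.nan marks a missing rating and that cosine and Pearson are computed over co-rated items only, which is one common convention among several; the function names are hypothetical.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity over the items both vectors have rated."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if not both.any():
        return 0.0
    x, y = a[both], b[both]
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom else 0.0

def pearson_sim(a, b):
    """Pearson correlation over co-rated items (needs at least two)."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if both.sum() < 2:
        return 0.0
    x = a[both] - a[both].mean()
    y = b[both] - b[both].mean()
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom else 0.0

def jaccard_sim(a, b):
    """Jaccard similarity of the sets of rated items; rating values are ignored."""
    ra, rb = ~np.isnan(a), ~np.isnan(b)
    union = (ra | rb).sum()
    return float((ra & rb).sum() / union) if union else 0.0

# Two hypothetical users rating four items
a = np.array([5.0, 4.0, 3.0, np.nan])
b = np.array([4.0, 5.0, np.nan, np.nan])

print(cosine_sim(a, b), pearson_sim(a, b), jaccard_sim(a, b))
```

Cosine looks at the raw rating vectors, Pearson removes each vector's mean first, and Jaccard only asks which items were rated, so the three can disagree noticeably on the same pair.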

Code Example

import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Ratings in long format: one row per (user, item, rating)
ratings = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    'item_id': [1, 2, 3, 1, 2, 2, 3, 4, 1, 4],
    'rating': [5, 4, 3, 4, 5, 5, 4, 5, 3, 4]
})

# Create user-item matrix; unrated cells are filled with 0,
# which here means "not rated", not a rating of zero
user_item_matrix = ratings.pivot(
    index='user_id', columns='item_id', values='rating'
).fillna(0)

print("User-Item Matrix:")
print(user_item_matrix)

# Calculate user-user similarity
user_similarity = cosine_similarity(user_item_matrix)

print("\nUser Similarity Matrix:")
print(pd.DataFrame(user_similarity, 
                  index=user_item_matrix.index,
                  columns=user_item_matrix.index))

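# Predict rating for user 1, item 4
# (not yet rated by user 1)
user_idx = 0  # user 1
item_idx = 3  # item 4

similar_users = user_similarity[user_idx]
item_ratings = user_item_matrix.iloc[:, item_idx].to_numpy()

# Only users who actually rated the item may contribute:
# a 0 in the matrix means "not rated" and must not be averaged in
rated = item_ratings > 0
rated[user_idx] = False  # never use the target user as a neighbor

# Similarity-weighted average over those neighbors
# (a simplified form of the user-based formula, without mean-centering)
weighted_sum = np.sum(similar_users[rated] * item_ratings[rated])
similarity_sum = np.sum(np.abs(similar_users[rated]))

# Guard against the case where no neighbor has a nonzero similarity
predicted_rating = weighted_sum / similarity_sum if similarity_sum > 0 else np.nan
print(f"\nPredicted rating for User 1, Item 4: {predicted_rating:.2f}")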