ReelSense

An Intelligent Hybrid Movie Recommendation System
Combining Collaborative Filtering + Content-Based Features with Explainability

Features • Tech Stack • Quick Start • How It Works • Metrics • Structure

Features

Feature	Description
Hybrid Recommendations	Combines collaborative filtering with content-based methods for superior accuracy
Smart Explainability	Every recommendation comes with a human-readable explanation
Comprehensive Evaluation	Includes Precision@K, Recall@K, diversity, and coverage metrics
Popularity-Based Fallback	Cold-start handling with weighted popularity scoring
Long-Tail Analysis	Visualizes and addresses item popularity distribution
Leave-Last-N Split	Proper temporal train-test splitting for recommender systems

Tech Stack

Python

Pandas

NumPy

Scikit-learn

Jupyter

Quick Start

Prerequisites

Python 3.8+
pip / pip3
Git

Installation

Step 1: Clone the Repository

git clone https://github.com/sciguy-code/reelsense.git
cd reelsense

Step 2: Create Virtual Environment

macOS

python3 -m venv venv
source venv/bin/activate

Linux (Ubuntu/Debian)

# Install venv if not available
sudo apt install python3-venv

# Create and activate
python3 -m venv venv
source venv/bin/activate

Windows (PowerShell)

python -m venv venv
.\venv\Scripts\Activate.ps1

Note: If you get an execution policy error, run:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Windows (Command Prompt)

python -m venv venv
venv\Scripts\activate.bat

Step 3: Install Dependencies

pip install -r requirements.txt

Step 4: Launch Jupyter Notebook

jupyter notebook reel_sense.ipynb

Tip: If jupyter is not found, install it with:
pip install jupyter

How It Works

Architecture Overview

flowchart TD
    subgraph Pipeline["ReelSense Pipeline"]
        A["Data Loading<br/>(CSV Files)"] --> B["Data Cleaning<br/>& Processing"]
        B --> C["Train/Test<br/>Split"]
        C --> D
        
        subgraph D["Hybrid Recommender Engine"]
            E["Collaborative Filtering<br/>(Cosine Similarity)"] 
            F["Content-Based<br/>(TF-IDF on Genres)"]
        end
        
        D --> G["Explainability Layer<br/>\"Recommended because you liked X\""]
    end
    
    style Pipeline fill:#1a1a2e,stroke:#16213e,color:#fff
    style D fill:#0f3460,stroke:#e94560,color:#fff
    style G fill:#533483,stroke:#e94560,color:#fff

Algorithm Details

Step 1: Data Preprocessing

Timestamp Conversion: Unix timestamps to datetime objects
Year Extraction: Regex-based extraction from movie titles
Genre Parsing: Pipe-separated genres to lists
Noise Filtering:
- Movies with < 5 ratings removed
- Users with < 20 ratings removed

Step 2: Collaborative Filtering

Uses Item-Item Similarity computed via cosine similarity:

similarity = item_vectors @ item_vectors.T
cf_scores = user_ratings @ similarity

Step 3: Content-Based Filtering

Leverages TF-IDF Vectorization on genre metadata:

tfidf_matrix = TfidfVectorizer().fit_transform(genres)
genre_sim = cosine_similarity(tfidf_matrix)

Step 4: Hybrid Score Blending

final_score = α × CF_score + (1 - α) × Content_score

Where α = 0.7 (adjustable weight parameter)

Evaluation Metrics

Model Performance

K Value	Precision@K	Recall@K
K=5	0.63%	3.16%
K=10	0.50%	4.98%
K=20	0.47%	9.30%

Diversity & Coverage

Metric	Value
Catalog Coverage	6.20%
Unique Items Recommended	604

Comparison: Hybrid vs Popularity-Based

Hybrid Model (K=10):      Precision: 0.50%  |  Recall: 4.98%
Popularity-Based (K=10):  Precision: 0.25%  |  Recall: 2.49%

Hybrid Model Improvement: ~2x better performance

Project Structure

reelsense/
├── reel_sense.ipynb      # Main Jupyter notebook with full implementation
├── requirements.txt      # Python dependencies
├── README.md             # Project documentation
├── ml-latest-small/      # MovieLens dataset
│   ├── movies.csv        # Movie metadata (9,742 movies)
│   ├── ratings.csv       # User ratings (100,836 ratings)
│   ├── tags.csv          # User-generated tags
│   └── links.csv         # External database links
└── venv/                 # Virtual environment (not tracked)

Dataset

This project uses the MovieLens ml-latest-small dataset:

Statistic	Value
Total Ratings	100,836
Total Movies	9,742
Total Users	610
Rating Scale	0.5 - 5.0
Time Range	1996 - 2018

Data Source

F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4: 19:1–19:19.

Sample Recommendations

For User #1 (Action/Sci-Fi Enthusiast):

Movie	Genres	Explanation
Fifth Element, The	Action\|Adventure\|Comedy\|Sci-Fi	You liked 'Star Wars: Episode VI', sharing Action, Sci-Fi, Adventure
Austin Powers	Action\|Adventure\|Comedy	You liked 'Austin Powers: International Man of Mystery', sharing Action, Comedy, Adventure
Die Hard	Action\|Crime\|Thriller	You liked 'Shaft', sharing Action, Thriller, Crime
Aliens	Action\|Adventure\|Horror\|Sci-Fi	You liked 'Star Wars: Episode VI', sharing Action, Sci-Fi, Adventure
Hunt for Red October	Action\|Adventure\|Thriller	You liked 'Dr. No', sharing Action, Thriller, Adventure

Future Improvements

Matrix Factorization (SVD/ALS)
Deep Learning approaches (Neural Collaborative Filtering)
Web interface with Flask/FastAPI
Real-time recommendation updates
Mobile-responsive frontend

Authors

Made with ❤️ for BrainDead 2026 Hackathon

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made by team CrackHeads for BrainDead 2026 Hackathon.Star this repo if you found it helpful!

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
ml-latest-small		ml-latest-small
.gitignore		.gitignore
README.md		README.md
reel_sense.ipynb		reel_sense.ipynb
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

ReelSense

Features

Tech Stack

Quick Start

Prerequisites

Installation

Step 1: Clone the Repository

Step 2: Create Virtual Environment

Step 3: Install Dependencies

Step 4: Launch Jupyter Notebook

How It Works

Architecture Overview

Algorithm Details

Step 1: Data Preprocessing

Step 2: Collaborative Filtering

Step 3: Content-Based Filtering

Step 4: Hybrid Score Blending

Evaluation Metrics

Model Performance

Diversity & Coverage

Comparison: Hybrid vs Popularity-Based

Project Structure

Dataset

Data Source

Sample Recommendations

For User #1 (Action/Sci-Fi Enthusiast):

Future Improvements

Authors

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages