🎮 Steam Game Recommender API

Steam Game Recommender is a content-based recommendation system built with Python and FastAPI.
It analyzes Steam games and returns similar titles based on their attributes such as genres, tags, developers, and publishers.

The system uses TF-IDF vectorization combined with cosine similarity to generate meaningful recommendations.

⚡ Automated Lifecycle & Pipeline

The core of this project is its self-configuring pipeline. Upon startup, the system detects the environment and executes the following flow:

Auto-Data Acquisition: If the .csv or .pkl files are missing, the system automatically fetches the raw dataset.
Preprocessing & Cleaning:
- Normalizes release dates and standardizes boolean platforms (Win/Mac/Linux).
- Handles missing values and inconsistent numeric types.
Model Training:
- Vectorizes metadata using TF-IDF.
- Computes the Cosine Similarity matrix.
Persistence (Pickle): Saves the trained model to .pkl for near-instant API startup in subsequent runs.
Database Export (ETL): Uses SQLAlchemy to dump the cleaned, processed data into a MySQL database (XAMPP/phpMyAdmin), making it available for external queries or reporting.

🚀 Features

📊 Data preprocessing and cleaning of Steam dataset
🧹 Handling missing and inconsistent values
- Normalization of release dates
- Conversion of numeric fields (int/float)
- Boolean standardization (Windows / Mac / Linux support)
- Missing value handling
🎮 Content-based recommendation engine
- Uses game metadata (genres, tags, developers, publishers)
- Finds similarity between games
🧠 TF-IDF vectorization for feature extraction
📐 Cosine similarity for recommendation ranking
⚡ FastAPI REST API for real-time queries
💾 Pickle-based model persistence for faster startup

🧠 How Recommendations Work (TF-IDF)

The system converts game metadata into numerical vectors using TF-IDF (Term Frequency - Inverse Document Frequency).

📌 TF-IDF Formula

TF-IDF is calculated as:

Where:

TF(t, d) = frequency of term t in document d
DF(t) = number of documents containing term t
N = total number of documents

📊 Intuition

Words that appear frequently in a game description are important (TF)
Words that appear in many games are less useful (DF penalty)
Final score highlights unique and relevant features per game

🎯 Recommendation Process

Each game is transformed into a TF-IDF vector
Cosine similarity is computed between vectors

Cosine similarity:

similarity = (A · B) / (||A|| × ||B||)

:contentReference[oaicite:0]{index=0}

Games with highest similarity scores are returned

📈 Output

The API returns a JSON list of the most similar games:

Game title
Similarity score

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
.vscode		.vscode
img		img
recommender		recommender
sqlAlchemy		sqlAlchemy
utilities		utilities
.env		.env
.gitignore		.gitignore
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🎮 Steam Game Recommender API

⚡ Automated Lifecycle & Pipeline

🚀 Features

🧠 How Recommendations Work (TF-IDF)

📌 TF-IDF Formula

📊 Intuition

🎯 Recommendation Process

📈 Output

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🎮 Steam Game Recommender API

⚡ Automated Lifecycle & Pipeline

🚀 Features

🧠 How Recommendations Work (TF-IDF)

📌 TF-IDF Formula

📊 Intuition

🎯 Recommendation Process

📈 Output

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages