💿 TrackBack

TrackBack is a custom Deep Learning pipeline built in PyTorch that maps text queries to their corresponding source documents. It uses a hybrid architecture, leveraging Transfer Learning for feature extraction and a custom neural network for classification.

Currently, the frontend is themed around an "ABBA Lyric Finder," but the backend is completely generalized and can be trained on any text dataset (legal documents, medical records, FAQs, etc.).

🧠 Architecture

This project implements a multi-stage Machine Learning pipeline:

- Feature Extraction: Uses the Hugging Face all-MiniLM-L6-v2 SentenceTransformer to convert text into 384-dimensional dense semantic embeddings.
- Custom Classifier: A PyTorch Feed-Forward Neural Network (FFNN) containing:
  - Input layer with Gaussian Noise (0.05) for extra regularization during training.
  - Hidden Layer (128 neurons) with Batch Normalization and ReLU activation.
  - Dropout Layer (p=0.5) to prevent overfitting.
  - Output Layer dynamically sized to the number of classes.
- Optimization: Trained using the Adam optimizer with Weight Decay (1e-3) and a ReduceLROnPlateau Learning Rate Scheduler.
- Validation: Uses an 80/20 train/test split, tracks Top-5 Accuracy, and applies Early Stopping so that the weights of the best-performing epoch are saved automatically.
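The classifier and optimizer described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/model.py: it assumes that 0.05 is the standard deviation of the noise, and the names `GaussianNoise` and `build_classifier`, the learning rate, the scheduler `factor` and `patience`, and the class count of 10 are all placeholders chosen for the example.

```python
import torch
from torch import nn


class GaussianNoise(nn.Module):
    """Adds zero-mean Gaussian noise to the input, only in training mode."""

    def __init__(self, std: float = 0.05):  # assumption: 0.05 is a standard deviation
        super().__init__()
        self.std = std

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training and self.std > 0:
            return x + torch.randn_like(x) * self.std
        return x  # evaluation mode: pass the input through unchanged


def build_classifier(num_classes: int, embed_dim: int = 384) -> nn.Sequential:
    """Hypothetical builder mirroring the layer list in this README."""
    return nn.Sequential(
        GaussianNoise(0.05),
        nn.Linear(embed_dim, 128),
        nn.BatchNorm1d(128),
        nn.ReLU(),
        nn.Dropout(p=0.5),
        nn.Linear(128, num_classes),  # output size follows the number of classes
    )


model = build_classifier(num_classes=10)  # 10 is a placeholder class count

# lr, factor and patience are illustrative values, not taken from the project.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=3
)

# Forward pass on a dummy batch of four 384-dimensional embeddings.
model.eval()
with torch.no_grad():
    logits = model(torch.zeros(4, 384))
```

In a training loop, `scheduler.step(val_loss)` would be called once per epoch so the learning rate drops when the validation loss stops improving.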

📂 Project Structure

TrackBack/
│
├── scrapper/
│   ├── songs.txt               # List of songs to scrape from Genius Lyrics
│   └── scraper.py              # Script to scrape the data
├── data/                       # Directory for storing raw .txt files (one file per song)
├── trackback_weights.pth       # Saved model weights (Generated after training)
├── learning_curve.png          # Plotted loss metrics (Generated after training)
├── genius_api.env              # Environment file for Genius API credentials (Git ignored)
├── example_genius_api.env      # Example template for setting up API credentials
├── requirements.txt            # Python package dependencies
├── .gitignore                  # Ignored files configuration
├── README.md                   # Project documentation
│
└── src/
    ├── data_prep.py            # Data loading, cleaning, tokenization, and splitting
    ├── model.py                # PyTorch architecture and training loop definition
    ├── train.py                # Offline execution script for training and plotting
    ├── UiConfig.py             # UI configuration, styling, and YouTube links mapping
    └── app.py                  # Streamlit web application for deployment
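Since data/ holds one .txt file per song, a loader along the following lines could turn that folder into labeled samples. This is only a sketch under assumptions: the function name `load_samples` is hypothetical, and it assumes each non-empty line of a file is one training sample and the file name (without extension) is the class name. The real logic in src/data_prep.py may clean and split the text differently.

```python
from pathlib import Path


def load_samples(data_dir):
    """Return (texts, labels, class_names) from a folder of .txt files.

    Assumption: every non-empty line is one sample, labeled by its file.
    """
    texts, labels = [], []
    files = sorted(Path(data_dir).glob("*.txt"))  # sorted for stable class indices
    class_names = [f.stem for f in files]
    for class_index, path in enumerate(files):
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line:  # skip blank lines between verses
                texts.append(line)
                labels.append(class_index)
    return texts, labels, class_names
```

Because the class list is derived from the files found, the output layer size can follow `len(class_names)` without any hard-coded song count.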

🚀 Setup & Execution

1. Installation

Clone the repository and install the required dependencies using pip:

pip install -r requirements.txt

2. Scraping Lyrics (Optional)

If you wish to augment the dataset, you must configure the Genius Lyrics API:

- Copy example_genius_api.env to a new file named genius_api.env.
- Add your Genius API credentials (GENIUS_CLIENT_ID, GENIUS_CLIENT_SECRET, GENIUS_ACCESS_TOKEN) to genius_api.env.
- Add any desired songs to scrapper/songs.txt.
- Run the scraper:
```
python scrapper/scraper.py
```
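If the scraper fails to authenticate, a quick check that genius_api.env defines all three credentials can help. The sketch below uses only the standard library and assumes the file holds plain KEY=VALUE lines; the helper names `read_env_file` and `missing_keys` are hypothetical, and the scraper itself may load the file another way.

```python
from pathlib import Path

REQUIRED_KEYS = ("GENIUS_CLIENT_ID", "GENIUS_CLIENT_SECRET", "GENIUS_ACCESS_TOKEN")


def read_env_file(path):
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """List the required credentials that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Calling `missing_keys(read_env_file("genius_api.env"))` returns an empty list when every credential is filled in.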

Note (Artist Filter): By default, scrapper/scraper.py contains a strict condition that only downloads and saves lyrics explicitly matched to the artist "ABBA". If you are training TrackBack on a different artist or a generic dataset, open scrapper/scraper.py and adjust or remove the "ABBA" condition.
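One way to make such a filter easier to adjust is to move it into a small function that takes the allowed artists as a parameter. The snippet below is a hypothetical illustration only; the actual condition in scrapper/scraper.py is not shown here and may look different.

```python
def is_wanted_artist(artist_name, allowed_artists=("ABBA",)):
    """Return True if the artist passes the filter.

    An empty `allowed_artists` disables the filter and accepts every artist.
    """
    if not allowed_artists:
        return True
    normalized = artist_name.strip().casefold()  # ignore case and stray spaces
    return any(normalized == allowed.strip().casefold() for allowed in allowed_artists)
```

Passing `allowed_artists=()` accepts any artist, which suits a generic dataset.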

3. Training the Model

Execute the offline training pipeline. This will process the data, extract embeddings, train the neural network, and generate the learning curve plot:

python src/train.py
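Two pieces of the training run, Top-5 Accuracy and Early Stopping, can be illustrated in plain Python. This is a sketch under assumptions rather than the project's implementation: the names `top_k_accuracy` and `EarlyStopping` and the `patience` value are hypothetical, and it assumes early stopping monitors validation loss, which the description above does not specify.

```python
def top_k_accuracy(scores, targets, k=5):
    """Fraction of samples whose true class is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        # Class indices ordered from highest to lowest score, cut to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += target in top_k
    return hits / len(targets)


class EarlyStopping:
    """Tracks the best validation loss and signals when to stop."""

    def __init__(self, patience=5):  # patience is an illustrative value
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0  # a real loop would save the weights here
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a PyTorch loop, the branch that records a new best loss is where something like `torch.save(model.state_dict(), "trackback_weights.pth")` would typically go.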

4. Running the Web App

Launch the interactive Streamlit dashboard:

streamlit run src/app.py

5. TrackBack Web App Demo

[Image: TrackBack Web App]
