BCA student at DIT University, Dehradun — developer, researcher, and photographer. My work sits at the intersection of computer vision, systems programming, and web development. I build things that have to actually work in production, not just on paper.
I come from an Android custom ROM maintenance background, which turns out to be a surprisingly solid foundation for low-level systems thinking. I'm also an avid space enthusiast and a firm believer that what you ship matters more than the degree on your transcript.
- 🔭 Currently working on: GlyphMotion (4K multi-object tracking pipeline) and Android custom ROM maintenance for the Nothing Phone 3a
- 🌱 Currently exploring: Python (ML/CV stack), C, Tailwind CSS, CUDA optimization
- 💬 Ask me about: Computer vision, video processing pipelines, Android internals, privacy, IoT, and network fundamentals
- 📫 Reach me: [email protected] · Instagram @shitijnotop · Telegram @shitijnotop
A production-grade 4K multi-object tracking pipeline built from scratch. Processes video through a 3-thread async architecture (reader / inference / writer) with bounded FIFO queues, YOLOv8m on CUDA, ByteTrack for ID-consistent tracking, and FFmpeg stdin-pipe encoding with full audio multiplexing. Outputs HEVC-encoded 4K video via NVENC.
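The reader / inference / writer layout with bounded FIFO queues can be sketched roughly as below. This is a minimal illustration using only the standard library, not GlyphMotion's actual code: the names `run_pipeline`, `SENTINEL`, and `queue_size`, and the `infer` callable standing in for YOLOv8m + ByteTrack, are all assumptions made for the example.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker passed down the pipeline


def reader(frames, out_q):
    """Push frames downstream; put() blocks when the queue is full (back-pressure)."""
    for frame in frames:
        out_q.put(frame)
    out_q.put(SENTINEL)


def inference(in_q, out_q, infer):
    """Run the (stand-in) model on each frame and forward frame + result."""
    while True:
        frame = in_q.get()
        if frame is SENTINEL:
            out_q.put(SENTINEL)  # propagate shutdown to the writer
            break
        out_q.put((frame, infer(frame)))


def writer(in_q, sink):
    """Consume results in arrival order; a real writer would feed an encoder."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        sink.append(item)


def run_pipeline(frames, infer, queue_size=8):
    """Wire the three stages together with two bounded FIFO queues."""
    q_frames = queue.Queue(maxsize=queue_size)
    q_results = queue.Queue(maxsize=queue_size)
    sink = []
    threads = [
        threading.Thread(target=reader, args=(frames, q_frames)),
        threading.Thread(target=inference, args=(q_frames, q_results, infer)),
        threading.Thread(target=writer, args=(q_results, sink)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink


if __name__ == "__main__":
    # Integers stand in for frames, and doubling stands in for inference.
    print(run_pipeline(range(5), lambda f: f * 2, queue_size=2))
```

Bounding both queues keeps memory flat when one stage is slower than the others: a fast reader simply blocks on `put()` until the inference thread catches up, and frame order is preserved because each stage has a single consumer.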
Production benchmark (10-video suite):
| Metric | Value |
|---|---|
| VMAF (avg) | 97.03 (range: 88.6 – 99.997) |
| MOTA (avg) | 87–88 |
| Encoder | hevc_nvenc · CRF 24 · ultrafast |
Two validated operating regimes:
| Mode | Throughput | VMAF | MOTA | ID Switches | Jitter |
|---|---|---|---|---|---|
| Adaptive_Temporal (default) | 16.6 FPS | 94.91 | 79.19 | 5 | ~28 ms |
| Adaptive_ROI_Temporal | 12.9 FPS | 95.07 | 78.87 | 13 | — |
Adaptive_Temporal is the production default — better throughput, lower ID switch count, and ~28 ms frame jitter that stays within acceptable bounds for continuous content.
Key technical contributions:
- HFDR (High-Frequency Detail Reinjection): Extracts HF components from the original frame and reinjects them post-encode at a tuned alpha strength (Output = Processed + α × HF_components). Alpha sweep validated at a05 / a07 / a10. Moved VMAF from ~79 to 94–99 while preserving tracking quality.
- Compression-improves-tracking finding: CRF-based noise suppression measurably improves MOTA (56 → 83), a result that contradicts standard assumptions in the tracking literature and is documented with citations in the paper.
- VFR normalisation gate (recon.py): Detects and corrects VFR, full-range pixel formats (yuvj420p), wrong colour spaces (BT.601 → BT.709), HDR, and interlacing before any thread launches. Critical for mobile-shot footage: a Nothing Phone 3a camera bug (VFR + yuvj420p + smpte170m) caused VMAF to collapse to 0.963 before this gate was introduced; after normalisation, VMAF recovered to 97.186.
- Inference at 1920px, output at 4K: TARGET_PROCESSING_WIDTH=1920 for CUDA inference; upscaled to 4K for final output, preserving detail via HFDR.
- Cross-hardware validation: RTX 3050 (6 GB) vs GTX 1050 Ti, with a MOTA delta of 0.18, confirming the pipeline generalises across mid-range NVIDIA hardware.
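The HFDR formula above (Output = Processed + α × HF_components) can be illustrated on a single row of luma samples. This is a dependency-free sketch under stated assumptions, not the project's implementation: the high-frequency component is taken here as the original minus a box-blurred copy, and the names `box_blur`, `hfdr`, and `radius` are hypothetical. A real pipeline would presumably operate on full frames with OpenCV or NumPy and may use a different low-pass filter.

```python
def box_blur(row, radius=1):
    """Simple low-pass filter: average each sample with its neighbours."""
    blurred = []
    for i in range(len(row)):
        window = row[max(0, i - radius): i + radius + 1]
        blurred.append(sum(window) / len(window))
    return blurred


def hfdr(original, processed, alpha, radius=1):
    """Return processed + alpha * HF(original), clamped to the 8-bit range.

    HF(original) is assumed to be original minus its low-pass version.
    """
    low = box_blur(original, radius)
    high_freq = [o - l for o, l in zip(original, low)]
    return [
        min(255.0, max(0.0, p + alpha * h))
        for p, h in zip(processed, high_freq)
    ]


if __name__ == "__main__":
    original = [10.0, 10.0, 200.0, 10.0, 10.0]   # a sharp edge in the source
    processed = [50.0, 50.0, 50.0, 50.0, 50.0]   # detail flattened by processing
    # alpha=0.5 is an arbitrary illustrative strength, not a tuned value.
    print(hfdr(original, processed, alpha=0.5))
```

With `alpha=0` the output equals the processed row unchanged; larger values restore more of the original edge contrast, at the risk of also restoring noise.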
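The checks performed by the normalisation gate can be sketched as a function over one video stream entry from `ffprobe -show_streams -of json` output. This is an assumption-laden illustration rather than the contents of recon.py: the function name `find_issues`, the issue labels, and the 1% frame-rate tolerance are hypothetical, and comparing `r_frame_rate` with `avg_frame_rate` is only a common heuristic for spotting VFR, not a definitive test.

```python
from fractions import Fraction

HDR_TRANSFERS = {"smpte2084", "arib-std-b67"}   # PQ and HLG transfer functions
BT601_SPACES = {"smpte170m", "bt470bg"}         # BT.601 colour matrices
INTERLACED_ORDERS = {"tt", "bb", "tb", "bt"}    # ffprobe interlaced field orders


def _rate(value):
    """Parse an ffprobe rate string such as '30000/1001'; None if unusable."""
    try:
        return Fraction(value)
    except (ValueError, ZeroDivisionError, TypeError):
        return None


def find_issues(stream, vfr_tolerance=0.01):
    """Return a list of labels for properties that would need normalising."""
    issues = []

    nominal = _rate(stream.get("r_frame_rate"))
    average = _rate(stream.get("avg_frame_rate"))
    # Heuristic: a noticeable gap between nominal and average rate suggests VFR.
    if nominal and average and abs(nominal - average) / nominal > vfr_tolerance:
        issues.append("vfr")

    # yuvj* pixel formats (and color_range 'pc') indicate full-range video.
    if str(stream.get("pix_fmt", "")).startswith("yuvj") or stream.get("color_range") == "pc":
        issues.append("full_range")

    if stream.get("color_space") in BT601_SPACES:
        issues.append("bt601")

    if stream.get("color_transfer") in HDR_TRANSFERS:
        issues.append("hdr")

    if stream.get("field_order") in INTERLACED_ORDERS:
        issues.append("interlaced")

    return issues


if __name__ == "__main__":
    # Made-up stream resembling the VFR + yuvj420p + smpte170m case described above.
    phone_clip = {
        "r_frame_rate": "30/1",
        "avg_frame_rate": "2711/100",
        "pix_fmt": "yuvj420p",
        "color_space": "smpte170m",
        "field_order": "progressive",
    }
    print(find_issues(phone_clip))
```

Running a check like this before any worker thread starts means a problematic input can be re-encoded or rejected up front, instead of surfacing later as a quality-metric collapse.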
Research paper (IEEE format, British English, 9 SVG figures, 7 tables, 16 references) co-authored with Sayan Sarkar and Dr. Kretika Goel (DIT University / IIT Delhi) — submission-ready as of April 2026.
Personal portfolio — web design, AI/ML projects, and visual storytelling. Built with vanilla HTML/CSS/JS; features a live moon phase calculator and photography showcase.
Photography portfolio — street and landscape shots taken exclusively on a phone. No DSLRs, no excuses.
Languages: Python · JavaScript · HTML/CSS · Bash · C
CV / ML: YOLOv8 (Ultralytics) · ByteTrack · PyTorch · CUDA · OpenCV · FFmpeg
Encoding / Video: HEVC/NVENC · libx264 · VMAF · FFprobe · VFR handling · async pipeline design
Web: Tailwind CSS · GitHub Pages · Cloudflare DNS · REST APIs
Android: Android SDK · Android NDK · ADB · Custom ROM maintenance (Nothing Phone 3a)
Tools: VS Code · Git · Linux (Ubuntu 24.04) · Android Studio
Dell G15 5530 · Intel i5-13450HX · RTX 3050 6 GB · 16 GB DDR5 · Ubuntu 24.04

