Upload any research paper PDF and get instant summaries, Q&A, interview prep, implementation roadmaps, and critical analysis, powered by RAG and LLaMA 3.1.
PDF Upload
→ Text Extraction (PyMuPDF)
→ Chunking (LangChain RecursiveCharacterTextSplitter, 500 tokens, 50 overlap)
→ Embedding (SentenceTransformers all-MiniLM-L6-v2)
→ FAISS Indexing (IndexFlatL2, top-5 semantic retrieval)
→ LLM Generation (Groq LLaMA 3.1 8B)
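To make the chunking step concrete, here is a minimal sketch of fixed-size windows with overlap. It is a simplification, not the project's code: `RecursiveCharacterTextSplitter` also splits recursively on separators such as paragraph and line breaks, whereas this sketch uses a plain sliding window and treats whitespace-separated words as a stand-in for tokens. The function name `chunk_words` is hypothetical.

```python
from __future__ import annotations


def chunk_words(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `chunk_size` words that share `overlap` words.

    Assumption: words approximate tokens; the real splitter may count differently.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far each new window advances
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(1000))
    for i, chunk in enumerate(chunk_words(sample)):
        print(i, len(chunk.split()), chunk.split()[0])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval at the cost of some duplicated text.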
| Task | Description |
|---|---|
| 📄 Summarize | Main contributions, methodology, key results |
| 💬 Q&A | Ask any question, answered from paper context only |
| 🎯 Interview Prep | 5 technical Q&A pairs based on paper content |
| 🗺️ Implementation Roadmap | Step-by-step PyTorch reproduction guide |
| 🔬 Critical Analysis | Strengths, limitations, future directions |
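One way the five tasks could map onto prompts is sketched below. This is an illustration under assumptions, not the prompts PaperLens actually uses: the task keys, the instruction wording, and the `build_prompt` helper are all hypothetical. It shows how retrieved chunks can be passed as the only context, which is what keeps Q&A answers grounded in the paper.

```python
from __future__ import annotations

# Hypothetical instruction text per task; the real prompts may differ.
TASK_INSTRUCTIONS = {
    "summarize": "Summarize the paper's main contributions, methodology, and key results.",
    "qa": ("Answer the question using only the context below. "
           "If the context does not contain the answer, say so."),
    "interview": "Write 5 technical question-and-answer pairs based on the paper content.",
    "roadmap": "Write a step-by-step guide to reproducing the paper in PyTorch.",
    "critique": "Discuss the paper's strengths, limitations, and future directions.",
}


def build_prompt(task: str, chunks: list[str], question: str = "") -> str:
    """Combine a task instruction with retrieved chunks into one prompt string."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"unknown task: {task!r}")
    if task == "qa" and not question.strip():
        raise ValueError("the qa task needs a question")

    # Number the chunks so the model can refer to them individually.
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    parts = [TASK_INSTRUCTIONS[task], "Context:\n" + context]
    if task == "qa":
        parts.append("Question: " + question.strip())
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("qa", ["First chunk.", "Second chunk."], "What is proposed?"))
```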
PaperLens/
├── app.py ← Gradio UI entry point
├── pipeline.py ← RAG pipeline logic
├── requirements.txt
└── README.md
git clone https://github.com/Vi-bha/PaperLens
cd PaperLens
pip install -r requirements.txt
export GROQ_API_KEY=your_key_here # free at console.groq.com
python app.py

Open http://localhost:7860 in your browser.
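Because the app depends on `GROQ_API_KEY` being set, a small preflight check can fail fast with a clear message instead of erroring on the first LLM call. The sketch below is an assumption about how such a check might look; `require_api_key` is a hypothetical helper and is not known to exist in `app.py`.

```python
from __future__ import annotations

import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ,
                    name: str = "GROQ_API_KEY") -> str:
    """Return the API key from the environment, or raise with a helpful message."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Run `export {name}=your_key_here` before starting the app."
        )
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found.")
    except RuntimeError as err:
        print(err)
```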
| Component | Technology |
|---|---|
| PDF Parsing | PyMuPDF (fitz) |
| Chunking | LangChain RecursiveCharacterTextSplitter |
| Embeddings | SentenceTransformers all-MiniLM-L6-v2 |
| Vector Store | FAISS IndexFlatL2 |
| LLM | Groq LLaMA 3.1 8B Instant |
| UI | Gradio 4.x |
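For intuition about the vector store: a flat L2 index performs exact, brute-force search, comparing the query against every stored vector by squared Euclidean distance and returning the closest ones. The pure-Python sketch below mirrors that idea without using FAISS; `top_k_l2` is a hypothetical name, and real usage would go through the FAISS index rather than a Python loop.

```python
from __future__ import annotations

from typing import Sequence


def top_k_l2(query: Sequence[float],
             vectors: Sequence[Sequence[float]],
             k: int = 5) -> list[tuple[int, float]]:
    """Return (index, squared L2 distance) for the k vectors nearest to `query`."""
    scored = []
    for idx, vec in enumerate(vectors):
        if len(vec) != len(query):
            raise ValueError(f"vector {idx} has a different dimension than the query")
        dist = sum((q - v) ** 2 for q, v in zip(query, vec))
        scored.append((dist, idx))

    scored.sort()  # smallest distance first; ties broken by lower index
    return [(idx, dist) for dist, idx in scored[:k]]


if __name__ == "__main__":
    stored = [[0, 0], [1, 2], [5, 5], [1, 1]]
    print(top_k_l2([1, 1], stored, k=3))
```

Exact search like this scales linearly with the number of chunks, which is usually fine for a single paper where the index holds at most a few hundred vectors.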
| Project | Description | Demo |
|---|---|---|
| ResearchMind | Autonomous AI scientist that queries 35M+ PubMed papers | 🤗 Live |
| MedLens | Multimodal AI for prostate MRI analysis | 🤗 Live |
Vibhavari Tummewar, M.Tech Advanced Computing, MANIT Bhopal

Scopus-indexed researcher in LLM-assisted medical AI systems.