ms1104n-max/README.md

Sai Nikhil Mattapalli

AI / ML Engineer · Healthcare ML · RAG · Generative AI

Shipping ML systems at the edge of healthcare, retrieval, and reliability.

📍 New Jersey, USA  ·  🎯 Open to senior AI/ML roles

🌐 Portfolio  ·  📫 Email  ·  💼 LinkedIn


📊 By the numbers

| 5+ yrs | 22% | 15% | 45% | 60% |
|---|---|---|---|---|
| Production AI/ML | RAG latency reduction | Model accuracy lift | Faster API responses | Integration efficiency |

👋 About

I'm an AI/ML engineer with 5 years of production experience building ML and Generative AI systems. Currently at Molina Health, designing risk-stratification models, RAG-driven care insights, and decision-support tools for Medicaid and Medicare populations.

Before Molina, I spent two and a half years at Cognizant building backend ML services, embedding pipelines, REST APIs on FastAPI/Flask, and the analytics plumbing that makes any of it possible. M.S. in Computer Science from SUNY Albany.

My work sits where modeling meets engineering: latency budgets, evaluation that survives production, and pipelines that don't fall over when the data changes underneath them. Offline accuracy is the start of the job — shipping is the rest.


🗓️ Career timeline

gantt
    title       Industry experience
    dateFormat  YYYY-MM
    axisFormat  %Y
    section Roles
    Software Engineer  ·  Cognizant                       :done,    c1, 2020-02, 2022-08
    AI / ML Engineer  ·  Molina Health  (current)         :active,  m1, 2023-08, 2026-05

🧭 Featured architecture  —  multi-agentic-rag

A multi-agent retrieval-augmented system where specialized agents handle routing, retrieval, query reformulation, fact-checking, and safety. Built with LangChain + Streamlit + Groq, hybrid file/URL knowledge sources, dynamic routing and self-correction.

flowchart LR
    Q([User query]) --> R{{Router agent}}
    R -- "files / URLs" --> Ret[Retriever]
    R -- "fresh facts" --> Web[Web search agent]
    Ret --> Rf[Query reformulator]
    Web --> Rf
    Rf --> Fc[Fact-checker agent]
    Fc --> Sc[Safety-checker agent]
    Sc --> Out([Grounded response])

    classDef agent fill:#1f6feb22,stroke:#1f6feb,stroke-width:1.5px,color:#1f6feb
    classDef io fill:#d4ff3a22,stroke:#8ba526,color:#3d3a35
    class R,Ret,Web,Rf,Fc,Sc agent
    class Q,Out io

github.com/ms1104n-max/multi-agentic-rag
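The control flow in the diagram above can be sketched in plain Python. This is a minimal illustration, not the repository's code: every name here (`State`, `route`, `run`, and the agent functions) is hypothetical, the agents are stubs that return canned strings, and the keyword-based router stands in for whatever LLM-driven routing the real project uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class State:
    """Carries the query and everything the agents add along the way."""
    query: str
    context: List[str] = field(default_factory=list)
    trace: List[str] = field(default_factory=list)
    answer: str = ""


def route(query: str) -> str:
    """Toy router (assumption): time-sensitive wording goes to web search."""
    fresh_markers = ("latest", "today", "current", "news")
    return "web" if any(m in query.lower() for m in fresh_markers) else "retriever"


def retriever(state: State) -> State:
    # Stub for a lookup over uploaded files / URLs.
    state.context.append(f"[doc] passage matching '{state.query}'")
    state.trace.append("retriever")
    return state


def web_search(state: State) -> State:
    # Stub for a live web search.
    state.context.append(f"[web] result for '{state.query}'")
    state.trace.append("web")
    return state


def reformulator(state: State) -> State:
    # Stub rewrite: collapse whitespace and drop a trailing question mark.
    state.query = " ".join(state.query.split()).rstrip("?")
    state.trace.append("reformulator")
    return state


def fact_checker(state: State) -> State:
    # Stub check: keep only context entries that carry a source tag.
    state.context = [c for c in state.context if c.startswith("[")]
    state.trace.append("fact_checker")
    return state


def safety_checker(state: State) -> State:
    # Stub policy: decline queries containing a blocked term.
    if "password" in state.query.lower():
        state.answer = "Request declined by safety check."
    else:
        state.answer = f"Answer grounded in {len(state.context)} source(s)."
    state.trace.append("safety_checker")
    return state


def run(query: str) -> State:
    """Router picks the first agent; the remaining stages run in diagram order."""
    sources: Dict[str, Callable[[State], State]] = {
        "retriever": retriever,
        "web": web_search,
    }
    state = State(query=query)
    for agent in (sources[route(query)], reformulator, fact_checker, safety_checker):
        state = agent(state)
    return state
```

Calling `run("What is the latest CMS guidance?")` sends the query down the web-search branch, while a query without a freshness marker goes to the retriever; in both cases `state.trace` records the order in which the agents ran.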


🚀 Currently building

| Repo | What it actually does |
|---|---|
| multi-agentic-rag | Multi-agent RAG with router · retriever · reformulator · web-search · fact-checker · safety agents (Streamlit + LangChain + Groq) |
| rag-chatbot | Production-aware RAG chatbot · CI/CD · M1 / NVIDIA llama.cpp · explicit cost / latency / hallucination tradeoffs |
| rag-from-scratch | RAG without framework abstractions: embeddings, local vector DB, retrieval, re-ranking, query rewriting from first principles |
| langchain-rag-document-understanding | LangChain + FAISS + SentenceTransformer pipeline for grounded document Q&A · Jupyter walkthrough |
| mlops-app | IaC reference stack: Terraform on GCP · BigQuery · GH Actions · Docker · Prefect · dbt · MLflow · FastAPI |
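To make the "from first principles" idea behind rag-from-scratch concrete, here is a standard-library-only sketch of the embed, store, retrieve, and re-rank steps. It is an illustration under stated assumptions rather than that repository's implementation: the names `embed`, `VectorStore`, and `rerank` are hypothetical, and term-count vectors stand in for learned embeddings.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> Counter:
    """Assumption: a bag-of-words count vector in place of a learned embedding."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory store that keeps each text next to its vector."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 3) -> List[Tuple[float, str]]:
        """Return the k highest-scoring (score, text) pairs."""
        q = embed(query)
        scored = [(cosine(q, vec), text) for text, vec in self.items]
        return sorted(scored, reverse=True)[:k]


def rerank(query: str, hits: List[Tuple[float, str]]) -> List[str]:
    """Second pass: order hits by the fraction of query terms they cover."""
    q_terms = set(tokenize(query))

    def coverage(text: str) -> float:
        if not q_terms:
            return 0.0
        return len(q_terms & set(tokenize(text))) / len(q_terms)

    return sorted((text for _, text in hits), key=coverage, reverse=True)
```

A typical use is to call `store.add(...)` for each passage, take `store.search(query, k=3)`, and pass the hits through `rerank` before building the prompt. Swapping `embed` for a sentence-embedding model would leave the rest of the flow unchanged.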

📖 Studying / extending


💼 Selected work impact

  • 22% lower latency  ·  production RAG pipelines for real-time care-management insights  —  Molina Health
  • 15% accuracy lift  ·  risk-adjustment and utilization-prediction models  —  Molina Health
  • 30% faster  ·  claims, eligibility, and provider data retrieval on AWS S3 + Snowflake  —  Molina Health
  • 45% faster  ·  API response times across FastAPI / Flask services  —  Cognizant
  • 60% efficiency lift  ·  cross-service ML integration  —  Cognizant

Full case studies & architecture detail at  sainikhil.com →


🛠️ Stack

mindmap
  root((Stack))
    AI and ML
      Python
      PyTorch
      TensorFlow
      XGBoost
      LightGBM
      Hugging Face
      SHAP
    Generative AI
      LangChain
      LlamaIndex
      CrewAI
      OpenAI
      Anthropic
      Azure OpenAI
      LoRA PEFT
    Data
      PySpark
      Pandas
      NumPy
      Snowflake
      Postgres
      MongoDB
    Vector DBs
      Pinecone
      ChromaDB
      FAISS
    Cloud
      AWS
      Vercel
      Firebase
    MLOps
      Docker
      Kubernetes
      MLflow
      Weights and Biases
      FastAPI
      Flask
      n8n
      CICD

🎯 Where my time goes

pie showData
    title    Focus distribution
    "Generative AI · RAG · Agents"   : 35
    "Healthcare ML"                  : 30
    "MLOps & Backend"                : 20
    "Data Engineering · Analytics"   : 15

🌱 Currently

  • 📍 Open to senior AI/ML and Generative-AI roles
  • 🛠️ Shipping healthcare RAG and decision-support systems at Molina Health
  • 📫 Reach me at  [email protected]

Production-first ML — offline accuracy is the start of the job; shipping is the rest.

🌐  sainikhil.com  →

Popular repositories

  1. contribution-history (Public)

    Generated GitHub contribution history

  2. mlops-app (Public)

    End-to-end MLOps reference stack — Terraform on GCP, BigQuery, GitHub Actions, Docker, Prefect, dbt, MLflow, FastAPI.

    HCL

  3. ai_soc (Public)

    Hands-on study of Srinivas et al. AI-Augmented SOC research (MDPI Informatics 2025). Original implementation by Abdul Bari — attribution preserved per Apache 2.0.

    Python

  4. multi-agentic-rag (Public)

    Multi-agent RAG with router, retriever, query reformulator, web-search, fact-checker, and safety agents — Streamlit + LangChain + Groq.

    Python

  5. langchain-rag-retrieval-augmented-generation-for-document-understanding (Public)

    LangChain + FAISS + SentenceTransformer pipeline for grounded document Q&A — step-by-step Jupyter walkthrough on real PDFs.

    Jupyter Notebook

  6. rag-chatbot (Public)

    Production-aware RAG chatbot with CI/CD, llama.cpp on M1 / NVIDIA, and explicit cost / latency / hallucination tradeoffs.

    Python