🚦
making AI work vertically...
ree2raz/README.md

Automated compliance scoring and LLM evaluation for regulated contact centers.

Verifiable output. Deterministic grading. Traceable failure modes.

Currently taking pilot engagements: compliance audits, eval-harness builds, and LLM QA pipeline architecture for collections, direct sales, and BPO verticals.


Featured

  • auditguard-mcp — A compliance-aware MCP server. Seven-stage pipeline including RBAC, PII detection, policy enforcement, and audit logging. 15-case eval, 100% pass. Live demo

  • Scrutiny — FDCPA/Reg F call transcript audit in 60 seconds. 12-rule rubric with verbatim evidence quotes and statutory citations. Dual-path evaluator. Live demo · Blog post

  • RegTriage-OpenEnv — An OpenEnv RL environment where the reward signal is auditor approval. 12 tasks, severity-weighted F1, auto-fail caps. Live demo · Blog post

  • LLM Deploy Cost Calculator — Production-grade GPU sizing, cost comparison, and break-even analysis for LLM deployment. Architecture-aware VRAM (GQA, MLA, MoE), throughput model, replica multiplier, pricing tiers. 49 model variants, 38 API plans. Live tool · Blog post

  • Inference Bench — Reproducible vLLM vs SGLang vs llama.cpp benchmark on NVIDIA L4 (via Modal). Concurrent-request sweeps, TTFT/TPOT, tail latency (p95/p99), success rate. SGLang leads throughput (+10%), vLLM leads TTFT at low concurrency. Showcase
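The latency metrics named in the Inference Bench entry can be sketched from per-request timestamps. This is a minimal illustration using the common definitions (TTFT as time from send to first token, TPOT as average time per token after the first, nearest-rank percentiles), not the benchmark's own code. The `RequestTiming` record and all function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Timestamps in seconds for one streamed request (hypothetical record shape)."""
    sent_at: float
    first_token_at: float
    finished_at: float
    output_tokens: int


def ttft(r: RequestTiming) -> float:
    """Time to first token: delay between sending the request and the first token."""
    return r.first_token_at - r.sent_at


def tpot(r: RequestTiming) -> float:
    """Time per output token: average gap between tokens after the first one."""
    if r.output_tokens < 2:
        return 0.0  # no inter-token gap to measure
    return (r.finished_at - r.first_token_at) / (r.output_tokens - 1)


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile, e.g. pct=95 for p95."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def success_rate(succeeded: int, total: int) -> float:
    """Fraction of requests that completed without error."""
    return succeeded / total if total else 0.0
```

Tail latency is then `percentile([ttft(r) for r in timings], 99)` over one concurrency level. Nearest-rank is only one of several percentile conventions, so small samples can give different values than an interpolating method.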

Production background

Previously shipped LLM features to enterprise production at a speech analytics company serving regulated contact centers: automated quality scoring, real-time compliance assistant, conversational analytics engine. Self-hosted on customer infrastructure. Zero data egress.

Writing

rituraj.info: notes on production ML, compliance systems, and agent architectures.

Work with me

Pilot engagements for mid-market collections agencies and direct-sales companies: FDCPA/FTC compliance audits, automated QA pipeline builds, LLM evaluation harness setup.

Pinned

  1. auditguard-mcp Public

    A 7-stage compliance pipeline that gates every LLM tool call through RBAC, PII detection, policy enforcement, and audit logging — before and after execution.

    Python

  2. inference-bench Public

    Reproducible head-to-head LLM serving benchmark: vLLM vs SGLang vs llama.cpp on a single L4 GPU. Measures throughput, TTFT, TPOT, tail latency, and success rate across concurrent request sweeps and…

    Python

  3. scrutiny Public

    Scrutiny audits debt-collection call transcripts against 12 FDCPA/Reg F rules in under 60 seconds.

    Python

  4. RegTriage-OpenEnv Public

    RegTriage is an OpenEnv RL environment that trains agents to perform regulatory compliance auditing on financial services contact center transcripts.

    Python 1
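
The auditguard-mcp description above mentions gating each tool call before and after execution. A rough sketch of that general pattern follows. It is not the project's code and does not reproduce its seven stages. The role table, regex patterns, length policy, and the `Gate` class are all hypothetical, chosen only to make the idea concrete.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical role-to-tool permissions; not taken from auditguard-mcp.
ROLE_TOOLS = {
    "analyst": {"lookup_account"},
    "admin": {"lookup_account", "export_report"},
}

# Deliberately simple PII patterns, for illustration only.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN-like number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email-like string
]

MAX_ARGUMENT_CHARS = 500  # hypothetical policy limit


def redact(text: str) -> str:
    """Replace anything matching a PII pattern with a placeholder."""
    for pattern in PII_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


@dataclass
class Gate:
    audit_log: list = field(default_factory=list)

    def call(self, role: str, tool_name: str,
             tool: Callable[[str], str], argument: str) -> str:
        # Before execution: RBAC check, then a policy check.
        allowed = tool_name in ROLE_TOOLS.get(role, set())
        if not allowed or len(argument) > MAX_ARGUMENT_CHARS:
            self.audit_log.append(
                {"role": role, "tool": tool_name, "decision": "denied"})
            raise PermissionError(f"{role} may not call {tool_name}")
        # Before execution: strip PII from the input.
        safe_argument = redact(argument)
        result = tool(safe_argument)
        # After execution: strip PII from the output, then record the call.
        safe_result = redact(result)
        self.audit_log.append({
            "role": role, "tool": tool_name, "decision": "allowed",
            "argument": safe_argument, "result": safe_result,
        })
        return safe_result
```

Logging only the redacted argument and result keeps the audit trail itself free of the PII it is meant to protect, which is one reason to run redaction on both sides of the call.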