Skip to content
View Dominic789654's full-sized avatar
🎯
Focusing
🎯
Focusing

Block or report Dominic789654

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
Dominic789654/README.md

Xiang Liu - Efficient LLM Inference and Agent Systems

Homepage Google Scholar X

Ph.D. student @ HKUST(GZ) · Research Intern @ Mind Lab
Efficient and reliable LLMs: inference, long context, KV cache, retrieval, and agentic workflows.


40+ stars
personal public non-fork repos
8.4k+ / 1.1k+
contributed projects: LMFlow / kvpress
benchmark → method → artifact
how I like research to ship

Current Focus

Inference efficiency
KV-cache compression, token-efficient reasoning, energy-to-token evaluation, serving bottlenecks.
Long-context evaluation
Generation-focused benchmarks, dense reasoning integrity, multi-turn coherence.
Agent systems
Tool use, post-training, harness design, local-first agent workflow infrastructure.
Research infrastructure
Reproducible artifacts, project pages, scholar tracking, figure and report tooling.

Selected Work

Contributed to an extensible toolkit for fine-tuning and inference of large foundation models.

stars Python

Long-context generation benchmark for coherent, context-aware long-form responses.

stars paper

Policy-conditioned live-market evaluation for LLM trading agents. Benchmark the policy, not just the model.

Python agents

Local-first agent task hub with SQLite queueing, dependency-aware dispatch, templates, and dashboards.

SQLite workflow

Adapters between XML-like tool calls and OpenAI-style structured tool-call histories.

Python tool use

Project page for evaluating LLM inference as energy-to-token production.

HTML serving

Research Map

long-context generation ──┬── LongGenBench
                          ├── semantic integrity under KV compression
                          └── multi-turn coherence / FlowKV

agent capability eval ────┬── QuantArena
                          ├── tool-use adapters
                          └── local-first agent workflow runtime

efficient inference ──────┬── ChunkKV / KV compression
                          ├── token-efficient reasoning
                          └── energy-to-token production

Stack

Python PyTorch TypeScript React SQLite LaTeX

GitHub stats Top languages

repositories · publications · citations

Popular repositories Loading

  1. LongGenBench LongGenBench Public

    Source code for the paper "LongGenBench: Long-context Generation Benchmark"

    Python 23 1

  2. critical-reviewer-skill critical-reviewer-skill Public

    A rigorous, critical reviewing style guide for academic paper review

    Ruby 3 1

  3. agent-hub agent-hub Public

    Local-first agent task hub with SQLite queueing, dependency-aware dispatch, task templates, pipelines, human inbox, saved queries, and a thin dashboard.

    Python 2

  4. scientific-figure scientific-figure Public

    Python 2

  5. Dominic789654.github.io Dominic789654.github.io Public

    Dom homepage

    TypeScript 1

  6. datawhale_lee_ML_note datawhale_lee_ML_note Public

    datawhale 李宏毅 《机器学习》 笔记

    1