
Commit 15d2176

chore: revert the kvbm workaround since trtllm v1.3.0rc3 is upgraded (#6495)
1 parent 80cac7c commit 15d2176

2 files changed

Lines changed: 0 additions & 28 deletions


docs/pages/components/kvbm/kvbm-guide.md

Lines changed: 0 additions & 24 deletions
````diff
@@ -204,30 +204,6 @@ cd $DYNAMO_HOME/examples/backends/vllm
 
 ### Disaggregated Serving with TRT-LLM
 
-> [!NOTE]
-> The latest TensorRT-LLM release (1.3.0rc1) is currently experiencing a request hang when running disaggregated serving with KVBM.
-> Please include the TensorRT-LLM commit id `18e611da773026a55d187870ebcfa95ff00c8482` when building the Dynamo TensorRT-LLM runtime image to test the KVBM + disaggregated serving feature.
-
-```bash
-# Build the Dynamo TensorRT-LLM container using commit ID 18e611da773026a55d187870ebcfa95ff00c8482. Note: This build can take a long time.
-./container/build.sh --framework trtllm --tensorrtllm-commit 18e611da773026a55d187870ebcfa95ff00c8482 --tensorrtllm-git-url https://github.com/NVIDIA/TensorRT-LLM.git
-
-# Launch the container
-./container/run.sh --framework trtllm -it --mount-workspace --use-nixl-gds
-```
-> [!NOTE]
-> Important: After logging into the Dynamo TensorRT-LLM runtime container, copy the Triton kernels into the container's virtual environment as a separate Python module.
-
-```bash
-# Clone the TensorRT-LLM repo and copy the triton_kernels folder into the container as a Python module.
-git clone https://github.com/NVIDIA/TensorRT-LLM.git /tmp/TensorRT-LLM && \
-cd /tmp/TensorRT-LLM && \
-git checkout 18e611da773026a55d187870ebcfa95ff00c8482 && \
-cp -r triton_kernels /opt/dynamo/venv/lib/python3.12/site-packages/ && \
-cd /workspace && \
-rm -rf /tmp/TensorRT-LLM
-```
-
 ```bash
 # Launch prefill worker with KVBM
 python3 -m dynamo.trtllm \
````

tests/kvbm_integration/test_determinism_disagg.py

Lines changed: 0 additions & 4 deletions
````diff
@@ -551,10 +551,6 @@ def tester(llm_server):
 class TestDeterminismDisagg(BaseTestDeterminism):
     """Test class for determinism validation."""
 
-    @pytest.mark.skipif(
-        check_module_available("tensorrt_llm"),
-        reason="Skipping test until the TRT-LLM disagg hang issue is fixed. (https://github.com/NVIDIA/TensorRT-LLM/pull/11247)",
-    )
     @pytest.mark.parametrize(
         "llm_server",
         [
````
