docs/pages/components/kvbm/kvbm-guide.md
Lines changed: 0 additions & 24 deletions
@@ -204,30 +204,6 @@ cd $DYNAMO_HOME/examples/backends/vllm
 
 ### Disaggregated Serving with TRT-LLM
 
-> [!NOTE]
-> The latest TensorRT-LLM release (1.3.0rc1) is currently experiencing a request hang when running disaggregated serving with KVBM.
-> Please include the TensorRT-LLM commit id `18e611da773026a55d187870ebcfa95ff00c8482` when building the Dynamo TensorRT-LLM runtime image to test the KVBM + disaggregated serving feature.
-
-```bash
-# Build the Dynamo TensorRT-LLM container using commit ID 18e611da773026a55d187870ebcfa95ff00c8482. Note: This build can take a long time.
-> Important: After logging into the Dynamo TensorRT-LLM runtime container, copy the Triton kernels into the container's virtual environment as a separate Python module.
-
-```bash
-# Clone the TensorRT-LLM repo and copy the triton_kernels folder into the container as a Python module.