Initial commit: obsidian to gitea

2026-05-07 15:04:41 +08:00
commit a57afa86b4
323 changed files with 42569 additions and 0 deletions

@@ -0,0 +1,17 @@
Objective
- Workload-centric KV cache scheduling
- XPURemoting adaptation for PhOS
Key Results
- Refactor the vLLM benchmark tools to report more precise metrics
- Simulate different token lengths and cache hit rates to quantify the effect of hit rate on performance
- Modify XPURemoting to support the new architecture
Last Week
- Implemented a unified vLLM benchmark tool that reports more precise metrics and provides a unified request builder.
- Measured the effect of KV cache hit rate and tried to determine the hit rate needed for a real performance improvement.
- Merged XPURemoting with the new features and PhOS support.
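
The hit-rate simulation above could be driven by a request builder along these lines. This is only a minimal sketch, not the actual benchmark tool: `Request`, `build_requests`, and `expected_hit_rate` are hypothetical names, the prompts are random token IDs, and the "hit rate" is modeled simply as the fraction of each prompt that is a prefix shared by all requests.

```python
import random
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical benchmark request: a prompt given as token IDs."""
    prompt_token_ids: list[int]
    shared_prefix_len: int  # number of leading tokens shared with other requests


def build_requests(num_requests: int, prompt_len: int, hit_rate: float,
                   vocab_size: int = 32000, seed: int = 0) -> list[Request]:
    """Build prompts whose first `hit_rate` fraction is a common prefix.

    Assumption: a prefix-caching engine can reuse the shared prefix for
    every request after the first, so `hit_rate` approximates the
    token-level cache hit rate of a warm cache.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be in [0, 1]")
    rng = random.Random(seed)
    prefix_len = round(prompt_len * hit_rate)
    shared_prefix = [rng.randrange(vocab_size) for _ in range(prefix_len)]
    requests = []
    for _ in range(num_requests):
        # Each request gets its own random suffix after the shared prefix.
        suffix = [rng.randrange(vocab_size)
                  for _ in range(prompt_len - prefix_len)]
        requests.append(Request(shared_prefix + suffix, prefix_len))
    return requests


def expected_hit_rate(requests: list[Request]) -> float:
    """Fraction of prompt tokens reusable from cache, skipping the first
    request, which is assumed to warm the cache."""
    warm = requests[1:]
    total = sum(len(r.prompt_token_ids) for r in warm)
    hits = sum(r.shared_prefix_len for r in warm)
    return hits / total if total else 0.0
```

Sweeping `prompt_len` and `hit_rate` over a grid and timing each batch would give the raw data for choosing a threshold. Real prefix caches usually match at block granularity, so the measured hit rate may be somewhat lower than the value requested here.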
Next Week
- Define a `good hit rate` for KV cache scheduling.
- Finish XPURemoting adaptation.