Gahow Wang gahow
  • Joined on 2026-04-03
gahow pushed to main at gahow/kvcache-simulator 2026-04-17 10:19:27 +00:00
cad83cec2f Merge branch 'feature/bucket-aware-routing'
43ada0cfc0 feat: add bucket score global router
b5a6fb964c feat: wire bucket identities through driver outputs
3a84c15068 fix: harden bucket routing review follow-up
fa381b5db3 feat: add bucketed service and strict global routing
Compare 12 commits »
gahow pushed to main at gahow/kvcache-simulator 2026-04-17 02:56:34 +00:00
82b3e2985f chore
gahow pushed to main at gahow/kvcache-simulator 2026-04-16 06:30:32 +00:00
67eef78244 chore: git ignore
gahow pushed to main at gahow/kvcache-simulator 2026-04-16 06:23:56 +00:00
996511f300 feat: new router and benchmark setup
gahow pushed to main at gahow/kvcache-simulator 2026-04-15 11:42:34 +00:00
c86d931d8f feat(ablate): input-length bucketing + auto-instance sizing
gahow pushed to main at gahow/kvcache-simulator 2026-04-15 11:08:14 +00:00
a3f386c858 feat: update ttft modeling and add cache affinity
ff316c6873 fix: cache calculation
gahow pushed to main at gahow/kvcache-simulator 2026-04-15 06:49:03 +00:00
365ceac3be chore: update ablation and clean configs
gahow pushed to main at gahow/kvcache-simulator 2026-04-14 07:46:47 +00:00
eaf574cd4e fix: kvcache evict workflow
663ca9c5b9 Support compute_dtype for FP4/FP8 tensor core FLOPS selection
84696604e8 Add B300 GPU preset and GLM-5-NVFP4 on 8xB300 config
gahow pushed to main at gahow/aituner 2026-04-14 02:27:09 +00:00
bf286ef2a6 docs: add qwen235b prefill 7-day compare
gahow pushed to main at gahow/kvcache-simulator 2026-04-14 02:22:40 +00:00
8d41123418 Update README with full feature documentation
gahow pushed to main at gahow/kvcache-simulator 2026-04-13 17:17:05 +00:00
ec73a95e05 KVCache simulator for LLM serving cluster routing research
gahow created branch main in gahow/kvcache-simulator 2026-04-13 17:17:05 +00:00
gahow created repository gahow/kvcache-simulator 2026-04-13 17:15:04 +00:00
gahow pushed to main at gahow/aituner 2026-04-13 12:50:40 +00:00
26f3b46966 compare: add multi-candidate runner
gahow pushed to main at gahow/aituner 2026-04-13 01:39:34 +00:00
18ff644b32 configs: add qwen235b prefill tight ttft 0323 study
gahow pushed to main at gahow/aituner 2026-04-13 01:37:07 +00:00
bbecec4e9f docs: add qwen235b tight ttft prefill summary
gahow pushed to main at gahow/aituner 2026-04-13 01:33:03 +00:00
ee9ec3c60b docs: add qwen235b decode 0323 summary
gahow pushed to main at gahow/aituner 2026-04-13 01:16:32 +00:00
a1b96f7dd2 docs: update qwen27b 7-day compare
gahow pushed to main at gahow/aituner 2026-04-12 15:09:31 +00:00
4625fba487 trace: make window materialization atomic
gahow pushed to main at gahow/aituner 2026-04-12 14:43:08 +00:00
631a076498 trace: include weekend legacy windows
ade81b5549 docs: add qwen27b chat 0-8k compare summary