chore: update ablation and clean configs

2026-04-15 14:48:59 +08:00
parent eaf574cd4e
commit 365ceac3be
15 changed files with 879 additions and 324 deletions
--- a/README.md
+++ b/README.md
@@ -58,11 +58,19 @@ Prints `summary.json` to stdout and writes the full output directory
 target/release/kvcache-sim ablate \
    --config configs/glm5-8xb200-hf.yaml \
    --routers random,least_loaded,least_tokens,min_pd,prefix_affinity \
+    --evict-policies lru \
    --output-dir runs/glm5_ablation
 ```

-Writes one subdirectory per router plus a combined
-`ablation.json` with side-by-side summaries.
+Writes `ablation.json` with one row per `router x evict_policy`.
+
+`ablate` currently accepts only `lru` as an eviction policy. The
+aggregated output keeps the online prefill-time metrics
+(`ttft_mean/p50/p95/p99`) and omits `e2e`.
+
+The previous replay-based `belady` approximation has been removed from
+the CLI because it was not an exact full-hierarchy Belady algorithm and
+could produce misleading comparisons against `lru`.

 ### 3. Compute theoretical hit-rate ceilings (oracle)

@@ -115,7 +123,8 @@ so the same config can be reused across sweeps:
 | `--ttl-seconds <S>`      | `cluster.meta_store.ttl_seconds`          |

 `oracle` additionally takes `--capacity-blocks <N>` / `--per-instance`
-and `--out <PATH>`. `ablate` additionally takes `--routers <csv>`.
+and `--out <PATH>`. `ablate` additionally takes `--routers <csv>` and
+`--evict-policies <csv>` (currently only `lru`).

 ## Router modes

@@ -288,12 +297,8 @@ memory_time  = layers * weight_bytes_per_layer / gpu_mem_bw
 | Config | Model | Hardware | Instances | Trace |
 |--------|-------|----------|-----------|-------|
 | `glm5-8xb200-hf.yaml` | GLM-5 via HF config.json | 8xB200 preset | 32 | GLM coder blk512 |
-| `glm5-8xb200-blk512.yaml` | GLM-5 inline | 8xB200 inline | 64 | GLM coder blk512 |
-| `glm5-8xb200.yaml` | GLM-5 inline | 8xB200 inline | 8 | GLM coder blk512 |
+| `glm5-nvfp4-8xb300.yaml` | GLM-5-NVFP4 via HF config.json | 8xB300 preset | 8 | GLM coder blk512 |
 | `qwen3-coder-480b-8xh20.yaml` | Qwen3-Coder via HF | 8xH20 preset | 32 | Qwen coder blk16 |
-| `qwen2.5-coder-7b-h800.yaml` | Qwen2.5-7B inline | H800 inline | 16 | Qwen coder blk16 |
-| `qwen2.5-coder-7b-preset.yaml` | Qwen2.5-7B inline | H800 preset | 16 | Qwen coder blk16 |
-| `qwen2.5-coder-32b-h800.yaml` | Qwen2.5-32B inline | H800 inline | 16 | Qwen coder blk16 |

 ## Outputs