Ship anonymized benchmark trace w600_r0.0015_st30 + provenance
Whitelist the sampled replay trace (1214 reqs / 274 sessions / ~600 s) past the traces/ ignore so the repo is runnable without dash0 access. Metadata only (token counts, opaque KV-block hashes, timing, session structure); no prompts/outputs/PII.

traces/README documents schema, provenance (sampled from the internal GLM-5.1 production trace via scripts/sample_trace.py), and the regeneration command.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
5
.gitignore
vendored
@@ -3,7 +3,10 @@ __pycache__/
 .venv/
 *.egg-info/
 outputs/
-traces/
+traces/*
+# ship the anonymized sampled trace + its provenance (metadata only, no cleartext)
+!traces/w600_r0.0015_st30.jsonl
+!traces/README.md
 *.log
 .claude/
 # third_party/vllm tracked in git for patch management
45
traces/README.md
Normal file
@@ -0,0 +1,45 @@
# Benchmark trace

## `w600_r0.0015_st30.jsonl`

The primary replay trace for the routing / connector experiments
(1214 requests, 274 sessions, ~600 s span). One JSON object per request:

```json
{"chat_id": 1237198, "parent_chat_id": -1, "timestamp": 0.0,
 "input_length": 8228, "output_length": 21, "type": "coder", "turn": 1,
 "hash_ids": [12292995, ...], "session_id": "1237198"}
```
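
As a quick sanity check on the schema above, the following is a minimal sketch of loading the JSONL and recomputing the headline figures (requests, sessions, span). The helper names `parse_trace` and `summarize` are illustrative and are not part of the repo; the two sample records are made up in the documented shape, not taken from the real trace.

```python
import json
from typing import Iterable


def parse_trace(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL text lines into request records, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize(records: list[dict]) -> dict:
    """Count requests and sessions and measure the arrival-time span in seconds."""
    timestamps = [r["timestamp"] for r in records]
    return {
        "requests": len(records),
        "sessions": len({r["session_id"] for r in records}),
        "span_s": max(timestamps) - min(timestamps) if timestamps else 0.0,
    }


# Two made-up records in the documented shape (NOT taken from the real trace).
sample_lines = [
    '{"chat_id": 1, "parent_chat_id": -1, "timestamp": 0.0, "input_length": 100, '
    '"output_length": 5, "type": "coder", "turn": 1, "hash_ids": [11, 12], "session_id": "1"}',
    '{"chat_id": 2, "parent_chat_id": 1, "timestamp": 4.5, "input_length": 180, '
    '"output_length": 9, "type": "coder", "turn": 2, "hash_ids": [11, 12, 13], "session_id": "1"}',
]
summary = summarize(parse_trace(sample_lines))
```

Against the shipped file, passing an open handle on `traces/w600_r0.0015_st30.jsonl` to `parse_trace` should yield counts in line with the figures quoted above, assuming each request occupies exactly one line.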

| field | meaning |
|---|---|
| `input_length` / `output_length` | token **counts** only |
| `hash_ids` | opaque integer KV-block hashes; shared ids ⇒ shared prefix (drives prefix-cache reuse in replay) |
| `timestamp` | arrival offset (s) from trace start |
| `turn` / `parent_chat_id` / `session_id` | multi-turn session structure |
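
To illustrate how shared `hash_ids` translate into prefix reuse, here is a rough sketch that estimates the block-level reuse ratio of a trace. It assumes a single, unbounded cache with no eviction and counts only the leading run of already-seen block ids per request; the function name `prefix_reuse_ratio` is hypothetical, and the real replay results depend on routing, cache capacity, and eviction, so treat the number as an idealized estimate rather than what the replayer reports.

```python
def prefix_reuse_ratio(records: list[dict]) -> float:
    """Fraction of blocks served from an idealized, unbounded prefix cache."""
    seen: set[int] = set()
    hit_blocks = 0
    total_blocks = 0
    # Process requests in arrival order, as a replay would.
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        ids = rec["hash_ids"]
        # Only the leading run of known blocks counts as a prefix hit.
        for block_id in ids:
            if block_id not in seen:
                break
            hit_blocks += 1
        total_blocks += len(ids)
        seen.update(ids)
    return hit_blocks / total_blocks if total_blocks else 0.0


# Toy records (made up): the second request extends the first one's prefix,
# the third starts with an unseen block and therefore gets no prefix hit.
toy = [
    {"timestamp": 0.0, "hash_ids": [11, 12]},
    {"timestamp": 4.5, "hash_ids": [11, 12, 13]},
    {"timestamp": 9.0, "hash_ids": [21, 11]},
]
ratio = prefix_reuse_ratio(toy)  # 2 hit blocks out of 7 total
```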

**No cleartext.** There are no prompts, no model outputs, and no PII: only
token counts, opaque block hashes, timing, and session structure. The replayer
synthesizes dummy token sequences consistent with `hash_ids` so prefix-cache
hit rates match the original workload.
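
One way such synthesis could work is sketched below; this is an illustration under stated assumptions, not the code in `replayer/`. Each block hash seeds a deterministic generator, so two requests that share leading `hash_ids` get identical leading tokens. The block size (16), the vocabulary size (32000), and the padding of any shortfall with a per-request tail are all assumptions; the trace format itself does not state them.

```python
import random


def block_tokens(hash_id: int, block_size: int, vocab_size: int) -> list[int]:
    """Deterministic dummy tokens for one block: same hash id, same tokens."""
    rng = random.Random(hash_id)
    return [rng.randrange(vocab_size) for _ in range(block_size)]


def synthesize_prompt(record: dict, block_size: int = 16, vocab_size: int = 32000) -> list[int]:
    """Build a dummy prompt of `input_length` tokens from the record's hash_ids.

    block_size and vocab_size are assumed values, not taken from the repo.
    """
    tokens: list[int] = []
    for hash_id in record["hash_ids"]:
        tokens.extend(block_tokens(hash_id, block_size, vocab_size))
    target = record["input_length"]
    if len(tokens) < target:
        # Assumption: pad any shortfall with a tail seeded by the request id.
        tail_rng = random.Random(f"tail-{record['chat_id']}")
        tokens.extend(tail_rng.randrange(vocab_size) for _ in range(target - len(tokens)))
    return tokens[:target]


# Made-up records: the second shares the first two blocks with the first.
first = synthesize_prompt({"chat_id": 1, "input_length": 40, "hash_ids": [11, 12]})
second = synthesize_prompt({"chat_id": 2, "input_length": 48, "hash_ids": [11, 12, 13]})
shared_prefix_matches = first[:32] == second[:32]
```

Because the tokens depend only on the hash id, the shared 32-token prefix is identical across both prompts, which is the property a prefix cache keys on.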

### Provenance
Sampled from the internal Alibaba GLM-5.1-formatted production trace
(`~/ali-trace/trace-glm5.1-formatted/051315-051317.jsonl` on dash0, ~2.1 M
requests, 2 h). The source is not redistributable; only this anonymized sample is shipped.
The filename encodes the sampling params: `w`=window-seconds, `r`=sample-ratio,
`st`=max-single-turn-ratio.
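
The naming convention can be decoded mechanically. The sketch below parses the three parameters from a trace filename; `parse_trace_name` is a hypothetical helper, and reading `st30` as a percentage (0.30) is an assumption inferred from the `--max-single-turn-ratio 0.30` flag in the regeneration command.

```python
import re
from pathlib import Path

# w<window-seconds>_r<sample-ratio>_st<max-single-turn-ratio, assumed to be in percent>
_NAME_RE = re.compile(r"^w(?P<w>\d+)_r(?P<r>\d+(?:\.\d+)?)_st(?P<st>\d+)$")


def parse_trace_name(path: str) -> dict:
    """Recover the sampling parameters encoded in a trace filename."""
    match = _NAME_RE.match(Path(path).stem)
    if match is None:
        raise ValueError(f"unrecognized trace name: {path}")
    return {
        "window_seconds": int(match["w"]),
        "sample_ratio": float(match["r"]),
        # Assumption: the st field is a percentage, so st30 means 0.30.
        "max_single_turn_ratio": int(match["st"]) / 100,
    }


params = parse_trace_name("traces/w600_r0.0015_st30.jsonl")
```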

Regenerate (requires the dash0 source):
```bash
python scripts/sample_trace.py \
  --input ~/ali-trace/trace-glm5.1-formatted/051315-051317.jsonl \
  --output traces/w600_r0.0015_st30.jsonl \
  --window-seconds 600 --sample-ratio 0.0015 --max-single-turn-ratio 0.30 --seed 42
```

### Replay
```bash
python -m replayer --trace traces/w600_r0.0015_st30.jsonl ...
```
See `replayer/` and `scripts/cache_aware_proxy.py`.
1214
traces/w600_r0.0015_st30.jsonl
Normal file
File diff suppressed because it is too large