Workload characterization C1-C3 on full production trace
Joint/temporal characterizations of the full 051315 cluster trace (2.11M
req / 1.31M sessions / 2h), beyond the existing single-variable marginals:
- C1 mixture: 90.3% sessions single-turn, but multi-turn (9.7%) = 44% reqs /
67% prefill mass; continuation hazard rises 10%->94% (Lindy); heaviness
unpredictable at turn 1 (corr 0.04-0.15) => reactive routing justified.
- C2 resident/delta: resident context 11k->56k while new-prefill 2.7k->~200;
per-turn reuse ->99.6%; resident/delta ("PD tax") ->~250-450x.
- C3 prefill/decode: token mass 98.7% input / 1.3% output, BUT decode ~70% of
TIME (robust 68-71%); "decode negligible" is wrong (tokens != time). Correct
colo argument = roofline complementarity, not "no decode".
Maps each to (1) PD-colocation and (2) routing. compute_chars.py + chars.json
+ figs/workload_chars/. Raw-file exact validation (cached_tokens, real
timings) pending.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
81
analysis/workload_chars/README.md
Normal file
@@ -0,0 +1,81 @@
# Agentic workload characterization C1–C3 (full 051315 production trace)

Date 2026-05-29. Source: `trace-glm5.1-formatted/051315-051317.jsonl` on dash1
(release file, 2,114,220 requests / 1,307,276 sessions / 2h, type=100% `coder`).
This release file **is the full cluster-level production trace**: session skew
reproduces 46.5/66.5/74.6/87.5/96.0 exactly. Compute: `compute_chars.py`
(2-pass, ~65s, `~/ali-trace/.venv` python). Numbers: `chars.json`.

> ⚠️ **Cluster-level, not per-instance.** This is one cluster's aggregate stream.
> Concurrent-session counts have NO denominator of "8 instances"; do not compare
> them to a single deployment's instance count.

These three are NOT in the existing 13 analyzer figures (which are single-variable
marginals on the older 041x traces). C1–C3 are joint/temporal and argument-bearing.

## C1: the workload is a MIXTURE, not "multi-turn agentic" (`c1_session_mixture.png`)

- **90.3%** of sessions are single-turn; mean 1.62 turns, p99=18, max=3091.
- But multi-turn sessions (9.7%) = **44.2% of requests** and **66.9% of input
  (prefill) mass**. Single-turn = **60.2% of output (decode) mass**.
- Continuation hazard P(reach k+1 | reached k): turn1→2 only **10.2%**, but
  turn2→3 50.6%, turn5→6 87%, turn12→13 **94.3%** (Lindy / Pareto).
- Predictability of heaviness at cold-start is near-zero:
  corr(turn1_input, session_mass)=0.15, corr(turn1_input, n_turns)=**0.04**.
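
A minimal sketch of how the continuation hazard in the list above can be estimated from per-turn request counts, using the same `count[k+1] / count[k]` ratio as `compute_chars.py`. The helper name `continuation_hazard` and the counts are illustrative assumptions, not trace data.

```python
from collections import Counter


def continuation_hazard(turn_counts, max_turn):
    """Estimate P(reach k+1 | reached k) as count[k+1] / count[k]."""
    return {
        k: (turn_counts[k + 1] / turn_counts[k] if turn_counts[k] else 0.0)
        for k in range(1, max_turn)
    }


# Made-up counts: 1000 requests at turn 1, 100 at turn 2, and so on.
counts = Counter({1: 1000, 2: 100, 3: 50, 4: 40})
hazard = continuation_hazard(counts, max_turn=4)

for k, h in hazard.items():
    print(f"turn{k}->{k + 1}: {h:.1%}")
```

A hazard that rises with k is the Lindy pattern: the deeper a session already is, the more likely it is to continue.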

**Routing:** heaviness is unpredictable at session start → proactive placement
cannot pre-empt hot-pin → a REACTIVE mechanism (observable-load routing /
migration) is required. But once a session has shown depth, it almost surely
continues → "observed accumulated load" is the signal that works (not turn-1
features, not cost-model prediction). The single/multi optimal strategies are
opposite (load-balance the 90% one-shot sea vs affinity-pin the deep tail) and
you can't tell them apart at turn 1 → the only viable policy starts everyone
load-balanced and becomes sticky as turns accrue. This is exactly LPWL's
emergent behavior (`new_uncached≈input`→by-load; `new_uncached≈0`→sticks), so
C1 explains *why* a cache-aware-load score is the right shape: it auto-segments
the mixture with no classifier.
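
To make the "auto-segments with no classifier" point concrete, here is a hedged toy sketch of a cache-aware-load score. It is not LPWL's actual implementation: the `Instance` type, its fields, and the additive score `pending_tokens + new_uncached` are all assumptions chosen only to reproduce the two regimes described above.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    pending_tokens: int  # observable queued work (assumed unit: tokens)
    # session_id -> tokens of that session's prefix cached on this instance
    cached_prefix: dict = field(default_factory=dict)


def new_uncached(inst, session_id, input_tokens):
    """Tokens this instance would have to prefill for the request."""
    return max(0, input_tokens - inst.cached_prefix.get(session_id, 0))


def pick_instance(instances, session_id, input_tokens):
    """Choose the instance with the lowest (load + uncached work) score."""
    return min(
        instances,
        key=lambda i: i.pending_tokens + new_uncached(i, session_id, input_tokens),
    )


a = Instance("a", pending_tokens=5000, cached_prefix={"s1": 50000})
b = Instance("b", pending_tokens=1000)

# Cold start: nothing is cached anywhere, so the choice is purely by load.
cold = pick_instance([a, b], "s2", input_tokens=11000)
# Deep turn: almost everything is cached on "a", so the session sticks there.
deep = pick_instance([a, b], "s1", input_tokens=50200)
print(cold.name, deep.name)
```

In this toy model no turn-1 feature is consulted; stickiness emerges only once a session has accumulated cached state on one instance.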

## C2: marginal work collapses while resident state explodes (`c2_work_amortization.png`)

Per turn: resident context grows 11k→56k+ tokens while new prefill collapses
2.7k→~200 tokens; per-turn reuse climbs 83%→**99.6%**; resident/new ratio
("the PD tax") grows to ~250× by turn 12, ~450× by turn 30.
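
The ratio and reuse figures follow directly from the per-turn medians. The sketch below recomputes them for three turns using median values copied from the `per_turn` arrays in `chars.json`, with the same formulas as `compute_chars.py` (`ratio = resident / new`, `reuse = 1 - new / resident`). The helper name `amortization` and the choice of turns are illustrative.

```python
# Per-turn medians copied from chars.json ("med_resident_input", "med_new_prefill").
per_turn = {
    2: {"resident": 19505.0, "new_prefill": 2920.0},
    14: {"resident": 59123.5, "new_prefill": 231.0},
    30: {"resident": 73844.0, "new_prefill": 168.0},
}


def amortization(resident, new_prefill):
    """Return (resident/new ratio, per-turn reuse in percent)."""
    new_prefill = max(new_prefill, 1)  # same divide-by-zero guard as compute_chars.py
    ratio = resident / new_prefill
    reuse_pct = (1 - new_prefill / resident) * 100
    return ratio, reuse_pct


results = {turn: amortization(**vals) for turn, vals in per_turn.items()}

for turn, (ratio, reuse_pct) in results.items():
    print(f"turn {turn}: resident/new = {ratio:.1f}x, reuse = {reuse_pct:.2f}%")
```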

**PD-colocation:** the dominant cost is keeping ~50k+ resident KV available for
the next turn's tiny delta. Disaggregation physically splits a turn's prefill-KV
(P) and decode-KV (D), and the next turn's prefix = [prevPrompt + prevAnswer]
spans both → must be gathered/transferred; colocation keeps it local for free.
**Routing:** route on delta (`input − cache_hit`), never total input. C2 is the
trace-level justification for LPWL's score function.

## C3: prefill/decode BALANCE (honest reframe) (`c3_prefill_decode_balance.png`)

- Token mass: 98.7% input / **1.3% output**; of input, 60% reused-prefix, 40%
  new-prefill (28.6B new-prefill tokens vs 0.94B decode tokens).
- **But tokens ≠ time.** Under a per-request latency model (prefill@7k tok/s,
  TPOT 10ms), aggregate decode-time share ≈ **70% (robust 68–71% across
  constants)**; each decode token costs ~70–140× a prefill token. So this is
  NOT a "decode is negligible" workload.
- Per-request the bottleneck FLIPS within a session: turn-1 (and the 90%
  single-turn) is prefill-bound; turns ≥3 are strongly decode-bound.
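
The tokens-versus-time gap can be reproduced from two totals. This sketch plugs the `token_mass` totals from `chars.json` into the same cost model as `agg_time` in `compute_chars.py`, with the three constant pairs used there; the function and dictionary names are illustrative.

```python
# Totals copied from chars.json ("token_mass").
TOTAL_INPUT = 71_116_829_368
TOTAL_NEW_PREFILL = 28_616_906_067
TOTAL_OUTPUT = 940_765_734


def decode_time_share(prefill_tok_per_s, tpot_s):
    """Fraction of modeled GPU time spent in decode under a per-request cost model."""
    t_prefill = TOTAL_NEW_PREFILL / prefill_tok_per_s
    t_decode = TOTAL_OUTPUT * tpot_s
    return t_decode / (t_prefill + t_decode)


# (prefill tokens/s, seconds per output token), as in compute_chars.py
scenarios = {
    "optimistic_for_prefill": (13000, 0.005),
    "mid": (7000, 0.010),
    "pessimistic": (3000, 0.025),
}
shares = {name: decode_time_share(*consts) for name, consts in scenarios.items()}

token_share = TOTAL_OUTPUT / (TOTAL_INPUT + TOTAL_OUTPUT)
print(f"decode share of tokens: {token_share:.1%}")
for name, share in shares.items():
    print(f"decode share of time ({name}): {share:.1%}")
```

The time share barely moves across the three scenarios because the prefill rate and TPOT constants are varied together.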

**PD-colocation (correct argument):** the workload has *substantial* work on both
sides of the roofline: compute-bound prefill (~30% of time) and memory-bound
decode (~70%). Colocation interleaves them on one GPU (chunked prefill +
continuous batching) so compute and HBM bandwidth are both used; static
disaggregation strands P-instances bandwidth-idle and D-instances compute-idle.
The earlier "decode is 1.3% so nothing to isolate" instinct was WRONG (token vs
time confusion); C3b is the correction.

**Caveat:** C3b's 70% is a per-request-latency-weighted estimate; batched decode
throughput will shift it. Ground-truth needs `-raw.jsonl` (`usage.cached_tokens`
for exact reuse; `backend_first_response_time_ms` / `total_cost_time_ms` for real
prefill vs decode wall time). Sampling that 522GB file is the next step.
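
One possible shape for that sampling step, as a hedged sketch: a single-pass reservoir sample over the raw lines, followed by a measured decode-time share. It assumes the two timing fields are top-level keys of each record, that `backend_first_response_time_ms` approximates prefill (plus queueing) time, and that the remainder of `total_cost_time_ms` is decode. None of this has been checked against the raw file, and the records below are synthetic.

```python
import json
import random


def reservoir_sample(lines, k, seed=0):
    """Uniformly sample k items from an iterable of unknown length (Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            sample.append(line)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = line
    return sample


def measured_decode_share(records):
    """Decode share of wall time, assuming first-response time ~ prefill time."""
    prefill_ms = sum(r["backend_first_response_time_ms"] for r in records)
    total_ms = sum(r["total_cost_time_ms"] for r in records)
    return (total_ms - prefill_ms) / total_ms


# Synthetic stand-in for lines of the raw file; a real run would iterate the file object.
raw_lines = [
    json.dumps({"backend_first_response_time_ms": 100 + i, "total_cost_time_ms": 1000 + i})
    for i in range(1000)
]
records = [json.loads(line) for line in reservoir_sample(raw_lines, k=50)]
print(f"sampled {len(records)} records, decode share = {measured_decode_share(records):.1%}")
```

Reservoir sampling keeps memory bounded by `k` regardless of file size, which matters for a file that cannot be loaded whole.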

## Goal mapping

| | argue PD-colocation | guide routing |
|---|---|---|
| C1 mixture + hazard | both segments favor colo (diff reasons) | reactive + auto-segment ⇒ LPWL shape |
| C2 resident/delta | the PD tax (transfer/split resident KV) | route on delta, not total |
| C3 prefill/decode | roofline complementarity (interleave) | per-req bottleneck flips within session |
964
analysis/workload_chars/chars.json
Normal file
@@ -0,0 +1,964 @@
{
  "mixture": {
    "single_sessions": 1179990,
    "multi_sessions": 127286,
    "req_single_pct": 55.81207253738968,
    "req_multi_pct": 44.187927462610325,
    "in_single_pct": 33.12487590117447,
    "in_multi_pct": 66.87512409882554,
    "out_single_pct": 60.24502960903973,
    "out_multi_pct": 39.75497039096027
  },
  "turns": {
    "mean": 1.6172713336739908,
    "p99": 18.0,
    "max": 3091,
    "single_turn_pct": 90.26326498765371
  },
  "hazard": {
    "1": 0.102101621998721,
    "2": 0.5062146469376287,
    "3": 0.7351961756478754,
    "4": 0.8113739305485657,
    "5": 0.8723731546954472,
    "6": 0.8669264241631353,
    "7": 0.9093235352011023,
    "8": 0.9240204920989971,
    "9": 0.901725753553022,
    "10": 0.9346178826585841,
    "11": 0.9260597637248089,
    "12": 0.9427685226874781,
    "13": 0.91950119395065,
    "14": 0.936865189289012,
    "15": 0.9382160896883085,
    "16": 0.9308646838684262,
    "17": 0.9371561574269995,
    "18": 0.9312862196131557,
    "19": 0.9333279456925813,
    "20": 0.9351459000779289,
    "21": 0.9399074074074074,
    "22": 0.9404984730568416,
    "23": 0.9473132921336546,
    "24": 0.9193940734188413,
    "25": 0.9497294046903187,
    "26": 0.9323793845764214,
    "27": 0.9483906016569333,
    "28": 0.9368466275239868,
    "29": 0.9472638336900031
  },
  "token_mass": {
    "total_input": 71116829368,
    "total_output": 940765734,
    "out_in_ratio_pct": 1.3228454394837104,
    "new_prefill": 28616906067,
    "reused_prefix": 42499923301,
    "new_prefill_pct_of_input": 40.23928839532401
  },
  "decode_time_fraction": {
    "optimistic_for_prefill": 0.6812079219496285,
    "mid": 0.6970810590484581,
    "pessimistic": 0.711448473592609
  },
  "per_turn": {
    "turn": [
      1,
      2,
      3,
      4,
      5,
      6,
      7,
      8,
      9,
      10,
      11,
      12,
      13,
      14,
      15,
      16,
      17,
      18,
      19,
      20,
      21,
      22,
      23,
      24,
      25,
      26,
      27,
      28,
      29,
      30,
      31,
      32,
      33,
      34,
      35,
      36,
      37,
      38,
      39,
      40,
      41,
      42,
      43,
      44,
      45,
      46,
      47,
      48,
      49,
      50,
      51,
      52,
      53,
      54,
      55,
      56,
      57,
      58,
      59,
      60,
      61,
      62,
      63,
      64,
      65,
      66,
      67,
      68,
      69,
      70,
      71,
      72,
      73,
      74,
      75,
      76,
      77,
      78,
      79,
      80,
      81,
      82,
      83,
      84,
      85,
      86,
      87,
      88,
      89,
      90,
      91,
      92,
      93,
      94,
      95,
      96,
      97,
      98,
      99,
      100,
      101,
      102,
      103,
      104,
      105,
      106,
      107,
      108,
      109,
      110,
      111,
      112,
      113,
      114,
      115,
      116,
      117,
      118,
      119,
      120,
      121,
      122,
      123,
      124,
      125,
      126,
      127,
      128,
      129,
      130,
      131,
      132,
      133,
      134,
      135,
      136,
      137,
      138,
      139,
      140,
      141,
      142,
      143,
      144,
      145,
      146,
      147,
      148
    ],
    "med_resident_input": [
      11035.0,
      19505.0,
      28059.0,
      35089.0,
      41215.0,
      44750.0,
      47419.5,
      49874.0,
      51905.0,
      53068.0,
      54782.0,
      56414.0,
      58229.0,
      59123.5,
      60434.5,
      61320.0,
      62243.0,
      63411.0,
      64510.5,
      65423.0,
      66942.5,
      67965.0,
      68826.0,
      70165.5,
      70052.0,
      70936.0,
      71547.0,
      72648.0,
      73406.0,
      73844.0,
      73604.0,
      74937.5,
      74778.0,
      75460.0,
      75029.0,
      74978.0,
      75933.0,
      76590.0,
      74695.0,
      76813.0,
      77079.5,
      78310.0,
      77848.0,
      77549.0,
      78203.0,
      79102.0,
      79202.0,
      78821.0,
      79868.0,
      80229.5,
      80912.0,
      81620.0,
      81612.5,
      81836.5,
      82506.0,
      82948.0,
      82633.0,
      84107.5,
      84176.0,
      84441.0,
      84101.0,
      85192.0,
      84127.0,
      84783.5,
      85087.0,
      85771.5,
      86110.0,
      85374.5,
      87137.0,
      87677.0,
      88587.0,
      88656.0,
      88882.0,
      89284.0,
      91512.0,
      89850.0,
      90596.0,
      91244.0,
      92102.0,
      93431.0,
      92333.5,
      96682.0,
      94999.0,
      95226.5,
      95173.0,
      95910.0,
      96528.0,
      96508.0,
      97270.0,
      97301.0,
      97076.5,
      97105.0,
      98032.0,
      97962.5,
      97968.5,
      98310.0,
      97061.0,
      97631.0,
      100126.0,
      97765.0,
      101076.0,
      98198.5,
      98678.0,
      98307.0,
      99174.0,
      99882.0,
      99974.0,
      99757.0,
      100065.5,
      99943.0,
      100612.0,
      101138.0,
      106738.0,
      99621.0,
      101980.0,
      102252.0,
      103018.0,
      101238.0,
      102005.0,
      101897.0,
      103576.0,
      102159.5,
      102695.5,
      100590.5,
      103236.0,
      101812.0,
      103074.0,
      99966.0,
      102183.5,
      101882.0,
      102572.5,
      105622.5,
      106066.0,
      103974.0,
      105443.5,
      104716.0,
      105041.0,
      106628.0,
      108320.0,
      108022.5,
      107621.5,
      107664.0,
      107913.0,
      108630.0,
      108382.0,
      107216.5,
      105731.0,
      103986.0
    ],
    "med_new_prefill": [
      11035.0,
      2920.0,
      1249.0,
      767.0,
      628.0,
      485.0,
      400.0,
      359.0,
      314.0,
      274.0,
      263.0,
      258.0,
      244.0,
      231.0,
      227.0,
      222.0,
      201.0,
      200.0,
      198.0,
      189.0,
      182.5,
      184.0,
      179.0,
      188.0,
      173.0,
      180.0,
      164.0,
      167.0,
      159.5,
      168.0,
      156.0,
      174.0,
      156.0,
      159.0,
      166.0,
      165.0,
      153.0,
      158.0,
      182.0,
      149.0,
      184.0,
      172.0,
      149.0,
      167.0,
      163.0,
      152.0,
      153.0,
      171.0,
      151.0,
      146.0,
      162.0,
      153.0,
      156.0,
      164.0,
      148.0,
      143.0,
      143.0,
      149.0,
      170.5,
      159.0,
      144.0,
      168.0,
      148.0,
      144.5,
      142.5,
      146.5,
      147.0,
      157.0,
      168.0,
      153.0,
      155.0,
      127.5,
      145.0,
      143.0,
      146.0,
      123.0,
      139.0,
      137.0,
      115.0,
      139.5,
      117.0,
      154.0,
      111.0,
      124.0,
      118.0,
      90.0,
      104.0,
      116.0,
      112.0,
      76.5,
      110.0,
      101.0,
      123.0,
      114.0,
      86.0,
      92.0,
      108.0,
      85.0,
      146.0,
      77.5,
      101.0,
      102.0,
      85.0,
      77.0,
      114.0,
      66.0,
      105.0,
      90.0,
      89.0,
      100.0,
      108.5,
      100.0,
      169.0,
      89.0,
      106.5,
      78.0,
      75.0,
      90.0,
      77.0,
      88.0,
      102.0,
      83.5,
      123.5,
      116.5,
      108.0,
      119.0,
      82.0,
      80.0,
      105.0,
      90.0,
      91.0,
      113.0,
      122.0,
      102.0,
      101.5,
      64.0,
      78.0,
      52.5,
      98.5,
      72.0,
      87.0,
      102.0,
      97.0,
      123.0,
      80.0,
      132.5,
      86.5,
      111.0
    ],
    "med_output": [
      63.0,
      67.0,
      111.0,
      142.0,
      158.0,
      162.0,
      164.0,
      164.0,
      159.0,
      160.0,
      159.0,
      161.0,
      160.0,
      158.0,
      154.0,
      154.0,
      154.0,
      149.0,
      146.0,
      147.0,
      142.0,
      144.0,
      143.0,
      142.0,
      140.0,
      136.0,
      137.0,
      139.0,
      136.0,
      133.0,
      130.0,
      131.0,
      125.0,
      123.0,
      122.0,
      122.0,
      118.0,
      122.0,
      114.0,
      112.0,
      115.0,
      111.0,
      109.0,
      112.0,
      109.0,
      107.0,
      111.0,
      105.0,
      108.0,
      107.0,
      100.0,
      100.0,
      95.0,
      105.0,
      103.0,
      102.0,
      100.0,
      100.0,
      98.0,
      98.0,
      101.0,
      99.0,
      101.0,
      102.0,
      97.0,
      91.0,
      100.0,
      97.0,
      94.0,
      98.5,
      92.5,
      97.0,
      102.0,
      92.0,
      95.0,
      91.0,
      91.0,
      92.0,
      85.0,
      98.0,
      96.0,
      99.0,
      94.0,
      96.0,
      90.0,
      85.0,
      99.0,
      86.0,
      99.0,
      93.0,
      92.0,
      93.0,
      87.0,
      83.0,
      87.5,
      82.0,
      80.0,
      90.0,
      92.0,
      80.0,
      77.0,
      82.0,
      87.0,
      74.0,
      83.0,
      79.0,
      84.0,
      80.5,
      79.0,
      76.0,
      78.5,
      71.5,
      81.0,
      87.0,
      82.0,
      85.0,
      87.0,
      75.0,
      75.0,
      82.0,
      86.0,
      76.5,
      77.5,
      70.0,
      78.0,
      85.0,
      77.0,
      67.0,
      76.5,
      107.0,
      92.0,
      80.5,
      85.0,
      83.0,
      77.0,
      70.0,
      84.0,
      69.0,
      97.0,
      72.0,
      81.0,
      87.0,
      89.0,
      102.0,
      83.0,
      82.5,
      91.0,
      79.5
    ],
    "resident_over_new": [
      1.0,
      6.679794520547945,
      22.46517213771017,
      45.748370273794,
      65.62898089171975,
      92.26804123711341,
      118.54875,
      138.92479108635098,
      165.30254777070064,
      193.67883211678833,
      208.29657794676805,
      218.65891472868216,
      238.64344262295083,
      255.94588744588745,
      266.23127753303964,
      276.2162162162162,
      309.6666666666667,
      317.055,
      325.81060606060606,
      346.15343915343914,
      366.8082191780822,
      369.375,
      384.5027932960894,
      373.22074468085106,
      404.9248554913295,
      394.0888888888889,
      436.2621951219512,
      435.0179640718563,
      460.2257053291536,
      439.54761904761904,
      471.8205128205128,
      430.67528735632186,
      479.34615384615387,
      474.59119496855345,
      451.98192771084337,
      454.41212121212124,
      496.29411764705884,
      484.746835443038,
      410.4120879120879,
      515.5234899328859,
      418.9103260869565,
      455.2906976744186,
      522.4697986577181,
      464.36526946107784,
      479.7730061349693,
      520.4078947368421,
      517.6601307189543,
      460.94152046783626,
      528.9271523178808,
      549.5171232876712,
      499.4567901234568,
      533.4640522875817,
      523.1570512820513,
      499.0030487804878,
      557.472972972973,
      580.0559440559441,
      577.8531468531469,
      564.4798657718121,
      493.7008797653959,
      531.0754716981132,
      584.0347222222222,
      507.0952380952381,
      568.4256756756756,
      586.7370242214533,
      597.1017543859649,
      585.4709897610921,
      585.7823129251701,
      543.7866242038217,
      518.672619047619,
      573.0522875816994,
      571.5290322580645,
      695.3411764705883,
      612.9793103448276,
      624.3636363636364,
      626.7945205479452,
      730.4878048780488,
      651.7697841726618,
      666.014598540146,
      800.8869565217391,
      669.7562724014336,
      789.1752136752136,
      627.8051948051948,
      855.8468468468468,
      767.9556451612904,
      806.5508474576271,
      1065.6666666666667,
      928.1538461538462,
      831.9655172413793,
      868.4821428571429,
      1271.9084967320262,
      882.5136363636364,
      961.4356435643564,
      797.0081300813008,
      859.3201754385965,
      1139.1686046511627,
      1068.5869565217392,
      898.7129629629629,
      1148.6,
      685.7945205479452,
      1261.483870967742,
      1000.7524752475248,
      962.7303921568628,
      1160.9176470588236,
      1276.7142857142858,
      869.9473684210526,
      1513.3636363636363,
      952.1333333333333,
      1108.411111111111,
      1124.3314606741574,
      999.43,
      927.2995391705069,
      1011.38,
      631.5857988165681,
      1119.3370786516855,
      957.5586854460093,
      1310.923076923077,
      1373.5733333333333,
      1124.8666666666666,
      1324.7402597402597,
      1157.9204545454545,
      1015.4509803921569,
      1223.4670658682635,
      831.5425101214574,
      863.4377682403433,
      955.8888888888889,
      855.563025210084,
      1257.0,
      1249.575,
      973.1761904761905,
      1132.0222222222221,
      1127.1703296703297,
      934.712389380531,
      869.3934426229508,
      1019.3529411764706,
      1038.8522167487686,
      1636.1875,
      1346.679487179487,
      2031.009523809524,
      1099.6954314720813,
      1500.3125,
      1237.028735632184,
      1055.5294117647059,
      1112.5051546391753,
      883.170731707317,
      1354.775,
      809.1811320754717,
      1222.3236994219653,
      936.8108108108108
    ],
    "reuse_pct": [
      0.0,
      85.02947962061009,
      95.5486653123775,
      97.81412978426287,
      98.47628290670872,
      98.91620111731844,
      99.1564651672835,
      99.28018606889361,
      99.39504864656584,
      99.48368131453984,
      99.5199153006462,
      99.54266671393626,
      99.5809648113483,
      99.60929241333818,
      99.62438673274372,
      99.63796477495107,
      99.67707212055974,
      99.68459730961506,
      99.69307322063851,
      99.71111077144124,
      99.72737797363409,
      99.72927241962775,
      99.73992386598088,
      99.73206205328829,
      99.75304059841261,
      99.74625014097215,
      99.7707800466826,
      99.77012443563484,
      99.78271530937526,
      99.77249336438979,
      99.78805499701103,
      99.76780650542119,
      99.79138249217684,
      99.78929234031276,
      99.77875221580989,
      99.77993544773133,
      99.7985065781676,
      99.7937067502285,
      99.75634245933462,
      99.80602241808027,
      99.76128542608606,
      99.78036010726599,
      99.80860137704244,
      99.78465228436214,
      99.79156809841055,
      99.8078430381027,
      99.80682306002375,
      99.7830527397521,
      99.81093804777883,
      99.81802204924622,
      99.79978247973106,
      99.8125459446214,
      99.8088528105376,
      99.79960042279423,
      99.82061910648923,
      99.82760283551141,
      99.82694565125313,
      99.822845762863,
      99.79744820376354,
      99.81170284577398,
      99.82877730348034,
      99.80279838482487,
      99.82407550489143,
      99.82956589430727,
      99.8325243574224,
      99.82919734410615,
      99.82928811984671,
      99.81610434028896,
      99.80720015607606,
      99.82549585410084,
      99.8250307607211,
      99.85618570655116,
      99.83686235683265,
      99.83983692486895,
      99.84045808200017,
      99.86310517529216,
      99.84657159256479,
      99.84985314102846,
      99.87513843347593,
      99.85069195449047,
      99.87328542728262,
      99.84071492108149,
      99.88315666480699,
      99.8697841462198,
      99.87601525642776,
      99.90616202690022,
      99.89225924084204,
      99.87980271065611,
      99.88485658476407,
      99.9213779920042,
      99.88668730331234,
      99.89598887801864,
      99.87453076546434,
      99.88362893964528,
      99.91221668189266,
      99.90641847217984,
      99.88872976787793,
      99.91293748911718,
      99.8541837285021,
      99.92072827699074,
      99.90007519094543,
      99.89612875960428,
      99.91386124566772,
      99.92167393980083,
      99.88505051727267,
      99.93392202799302,
      99.89497269290015,
      99.90978076726445,
      99.91105825684177,
      99.89994296749147,
      99.89215998091679,
      99.90112519527774,
      99.84166838426802,
      99.91066140673152,
      99.89556775838399,
      99.92371787348903,
      99.9271971888408,
      99.91110057488295,
      99.92451350423998,
      99.91363828179436,
      99.90152158801267,
      99.91826506590185,
      99.87974156608614,
      99.8841838941053,
      99.89538533069859,
      99.883117903587,
      99.92044550517105,
      99.91997279074886,
      99.89724368415645,
      99.91166251153295,
      99.91128226376466,
      99.89301521929514,
      99.88497727829841,
      99.90189855156096,
      99.9037399175862,
      99.93888231024867,
      99.92574328119497,
      99.95076340173313,
      99.90906573116692,
      99.9333472193293,
      99.91916113415999,
      99.90526081141329,
      99.91011277603255,
      99.88677161005248,
      99.92618700522226,
      99.8764182751722,
      99.91818861071965,
      99.89325486123131
    ]
  }
}
180
analysis/workload_chars/compute_chars.py
Normal file
@@ -0,0 +1,180 @@
import json, sys, math, statistics as st
from collections import defaultdict, Counter
import matplotlib; matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np

PATH="/home/admin/cpfs/wjh/ali-trace/trace-glm5.1-formatted/051315-051317.jsonl"
OUT="/tmp/wlc_out"; import os; os.makedirs(OUT, exist_ok=True)
BLOCK=512
# --- transparent cost model for C3 (clearly-labeled estimate; raw-timing validation pending) ---
PREFILL_TOK_S=7000.0   # MB1: 32k->4.5s ~7100 tok/s effective on H20 / 30B-A3B
TPOT_S=0.010           # ~10ms/token decode (crossover unloaded ~5ms, loaded ~25ms)

def pct(v,p):
    if not v: return float('nan')
    s=sorted(v);k=(len(s)-1)*p;f=int(k)
    return s[f] if f+1>=len(s) else s[f]+(s[f+1]-s[f])*(k-f)

# ---------- Pass A: structure (scalars only) ----------
parents={}; recs={}; childcount=Counter()
for line in open(PATH):
    if not line.strip(): continue
    d=json.loads(line); cid=d["chat_id"]; pid=d["parent_chat_id"]
    parents[cid]=pid
    recs[cid]=(float(d["timestamp"]),int(d["input_length"]),int(d["output_length"]),int(d["turn"]))
    if pid!="-1": childcount[pid]+=1
print(f"[A] records={len(recs)}", file=sys.stderr)

root_of={}
def root(cid):
    path=[];c=cid
    while True:
        if c in root_of:r=root_of[c];break
        p=parents.get(c,"-1")
        if p=="-1" or p not in recs:r=c;break
        path.append(c);c=p
    for x in path:root_of[x]=r
    root_of[cid]=r;return r
sessions=defaultdict(list)
for cid in recs: sessions[root(cid)].append(cid)
seq={r:sorted(m,key=lambda c:(recs[c][3],recs[c][0])) for r,m in sessions.items()}
print(f"[A] sessions={len(seq)}", file=sys.stderr)

# ---------- C1: mixture + turn tail + hazard ----------
sr=mr=sm=mm=so=mo=0
turns_per=[]
for r,s in seq.items():
    multi=len(s)>1; turns_per.append(len(s))
    for c in s:
        _,inl,outl,_=recs[c]
        if multi: mr+=1;mm+=inl;mo+=outl
        else: sr+=1;sm+=inl;so+=outl
tot_r=sr+mr; tot_in=sm+mm; tot_out=so+mo
cnt_turn=Counter()
for r,s in seq.items():
    for c in s: cnt_turn[recs[c][3]]+=1
hazard={k: (cnt_turn[k+1]/cnt_turn[k] if cnt_turn[k] else 0) for k in range(1,30)}

# ---------- C2/C3: per-turn resident vs new-prefill (scalar) + hash_ids reuse ----------
by_in=defaultdict(list); by_new=defaultdict(list); by_out=defaultdict(list)
by_reuse_hash=defaultdict(list)  # hash-block prefix stability: reused/parent_blocks
store={}  # cid -> (blockset, in, out) for chats with pending children
tot_new_prefill=0; tot_reused=0
for line in open(PATH):
    if not line.strip(): continue
    d=json.loads(line); cid=d["chat_id"]; pid=d["parent_chat_id"]
    inl=int(d["input_length"]); outl=int(d["output_length"]); turn=int(d["turn"])
    blocks=set(d["hash_ids"])
    if pid in store:
        pblk,pin,pout=store[pid]
        new_prefill=max(0, inl - pin - pout)  # actual recompute (accounts for cached answer)
        reused_blk=len(blocks & pblk)
        by_reuse_hash[turn].append(reused_blk/len(pblk) if pblk else 0)
        childcount[pid]-=1
        if childcount[pid]<=0: del store[pid]
        tot_reused += (inl-new_prefill)
    else:
        new_prefill=inl  # session start: all new (intra-session)
    tot_new_prefill+=new_prefill
    by_in[turn].append(inl); by_new[turn].append(new_prefill); by_out[turn].append(outl)
    if childcount[cid]>0: store[cid]=(blocks,inl,outl)
print(f"[B] done; store residual={len(store)}", file=sys.stderr)
|
||||||
|
# keep only turns with at least 50 samples so the medians are stable
TURNS=[t for t in sorted(by_in) if len(by_in[t])>=50]
med_in=[pct(by_in[t],.5) for t in TURNS]
med_new=[max(pct(by_new[t],.5),1) for t in TURNS]   # floor at 1 token to avoid division by zero below
med_out=[pct(by_out[t],.5) for t in TURNS]
ratio=[med_in[i]/med_new[i] for i in range(len(TURNS))]
reuse_pct=[(1-med_new[i]/med_in[i])*100 for i in range(len(TURNS))]
# C3 time per turn (cost model)
t_pref=[med_new[i]/PREFILL_TOK_S for i in range(len(TURNS))]
t_dec=[med_out[i]*TPOT_S for i in range(len(TURNS))]

# aggregate decode/prefill time fraction over a RANGE of constants
# prate = prefill throughput (tokens/s), tpot = seconds per output token
def agg_time(prate,tpot):
    tp=tot_new_prefill/prate; td=tot_out*tpot; return td/(tp+td)
frac_lo=agg_time(13000,0.005); frac_mid=agg_time(7000,0.010); frac_hi=agg_time(3000,0.025)

chars={
  "mixture":{"single_sessions":sum(1 for s in seq.values() if len(s)==1),
             "multi_sessions":sum(1 for s in seq.values() if len(s)>1),
             "req_single_pct":sr/tot_r*100,"req_multi_pct":mr/tot_r*100,
             "in_single_pct":sm/tot_in*100,"in_multi_pct":mm/tot_in*100,
             "out_single_pct":so/tot_out*100,"out_multi_pct":mo/tot_out*100},
  "turns":{"mean":st.mean(turns_per),"p99":pct(turns_per,.99),"max":max(turns_per),
           "single_turn_pct":sum(1 for x in turns_per if x==1)/len(turns_per)*100},
  "hazard":hazard,
  "token_mass":{"total_input":tot_in,"total_output":tot_out,"out_in_ratio_pct":tot_out/tot_in*100,
                "new_prefill":tot_new_prefill,"reused_prefix":tot_reused,
                "new_prefill_pct_of_input":tot_new_prefill/tot_in*100},
  "decode_time_fraction":{"optimistic_for_prefill":frac_lo,"mid":frac_mid,"pessimistic":frac_hi},
  "per_turn":{"turn":TURNS,"med_resident_input":med_in,"med_new_prefill":med_new,
              "med_output":med_out,"resident_over_new":ratio,"reuse_pct":reuse_pct},
}
# close the file handle explicitly so chars.json is flushed before the figures are drawn
with open(f"{OUT}/chars.json","w") as fh:
    json.dump(chars, fh, indent=2)

# ================= FIGURES =================
plt.rcParams.update({"figure.dpi":140,"font.size":10,"axes.grid":True,"grid.alpha":.3})

# ---- C1 ----
fig,ax=plt.subplots(1,3,figsize=(15,4.2))
cats=["% sessions","% requests","% input\ntokens","% output\ntokens"]
singv=[chars["mixture"]["single_sessions"]/len(seq)*100, chars["mixture"]["req_single_pct"],
       chars["mixture"]["in_single_pct"], chars["mixture"]["out_single_pct"]]
multv=[100-x for x in singv]
x=np.arange(len(cats))
ax[0].bar(x,singv,label="single-turn",color="#7fb3d5")
ax[0].bar(x,multv,bottom=singv,label="multi-turn",color="#e74c3c")
for i in range(len(cats)):
    ax[0].text(i,singv[i]/2,f"{singv[i]:.0f}",ha="center",va="center",fontsize=9)
    ax[0].text(i,singv[i]+multv[i]/2,f"{multv[i]:.0f}",ha="center",va="center",color="white",fontsize=9)
ax[0].set_xticks(x);ax[0].set_xticklabels(cats);ax[0].set_ylabel("%");ax[0].set_ylim(0,100)
ax[0].set_title("C1a Mixture: 90% sessions single-turn,\nbut multi-turn carries 2/3 prefill mass");ax[0].legend(loc="center right")
# turn CCDF log-log
tc=sorted(turns_per); n=len(tc); xs=sorted(set(tc))
ccdf=[sum(1 for v in tc if v>=xx)/n for xx in xs]
ax[1].loglog(xs,ccdf,marker=".",ms=3,color="#34495e")
ax[1].set_xlabel("turns per session (k)");ax[1].set_ylabel("P(turns >= k)")
ax[1].set_title(f"C1b Heavy-tailed session length\n(p99={chars['turns']['p99']:.0f}, max={chars['turns']['max']})")
# hazard
hk=list(range(1,20)); hv=[hazard[k]*100 for k in hk]
ax[2].plot(hk,hv,marker="o",color="#16a085")
ax[2].set_xlabel("reached turn k");ax[2].set_ylabel("P(continue to k+1) %");ax[2].set_ylim(0,100)
ax[2].set_title("C1c Continuation hazard rises 10%->94%\n(unpredictable at start, Lindy after)")
fig.tight_layout(); fig.savefig(f"{OUT}/c1_session_mixture.png"); plt.close(fig)

# ---- C2 ----
fig,ax=plt.subplots(1,3,figsize=(15,4.2))
ax[0].semilogy(TURNS,med_in,marker="o",label="resident context (input)",color="#e74c3c")
ax[0].semilogy(TURNS,med_new,marker="s",label="new prefill this turn",color="#2980b9")
ax[0].set_xlabel("turn");ax[0].set_ylabel("tokens (median, log)");ax[0].legend()
ax[0].set_xlim(1,30)
ax[0].set_title("C2a Resident state explodes,\nmarginal work collapses")
ax[1].plot(TURNS,ratio,marker="o",color="#8e44ad")
ax[1].set_xlabel("turn");ax[1].set_ylabel("resident / new-prefill");ax[1].set_xlim(1,30)
ax[1].set_title("C2b The PD tax = resident/delta\n(grows to ~250x by deep turns)")
ax[2].plot(TURNS,reuse_pct,marker="o",color="#27ae60")
ax[2].set_xlabel("turn");ax[2].set_ylabel("per-turn reuse %");ax[2].set_ylim(50,100);ax[2].set_xlim(1,30)
ax[2].set_title("C2c Per-turn reuse climbs to 99.6%\n(deep turns are near-pure cache hits)")
fig.tight_layout(); fig.savefig(f"{OUT}/c2_work_amortization.png"); plt.close(fig)

# ---- C3 ----
fig,ax=plt.subplots(1,2,figsize=(11,4.4))
# token mass decomposition
vals=[tot_reused/1e9, tot_new_prefill/1e9, tot_out/1e9]
labs=[f"reused prefix\n{tot_reused/tot_in*100:.0f}% of input",
      f"new prefill\n{tot_new_prefill/tot_in*100:.0f}% of input",
      f"decode output\n{tot_out/tot_in*100:.1f}% of input"]
ax[0].bar(range(3),vals,color=["#95a5a6","#2980b9","#e67e22"])
ax[0].set_xticks(range(3));ax[0].set_xticklabels(labs,fontsize=8.5)
ax[0].set_ylabel("tokens (billions)")
ax[0].set_title("C3a Token mass: prefill-dominated\n(but tokens != time, see C3b)")
# per-turn prefill vs decode TIME (cost model)
ax[1].semilogy(TURNS,t_pref,marker="o",label="prefill time (new tok / 7k·s⁻¹)",color="#2980b9")
ax[1].semilogy(TURNS,t_dec,marker="s",label="decode time (out·10ms)",color="#e67e22")
ax[1].set_xlabel("turn");ax[1].set_ylabel("seconds (median, log)");ax[1].legend(fontsize=8);ax[1].set_xlim(1,30)
ax[1].set_title(f"C3b Prefill→decode bottleneck flips within a session\n(agg decode-time share ≈ {frac_mid*100:.0f}%, range {frac_lo*100:.0f}–{frac_hi*100:.0f}%)")
fig.tight_layout(); fig.savefig(f"{OUT}/c3_prefill_decode_balance.png"); plt.close(fig)
print("FIGURES + chars.json written to", OUT)
print(json.dumps({k:chars[k] for k in ["mixture","turns","token_mass","decode_time_fraction"]}, indent=2))
BIN
figs/workload_chars/c1_session_mixture.png
Normal file
Binary file not shown.
Size: 111 KiB
BIN
figs/workload_chars/c2_work_amortization.png
Normal file
Binary file not shown.
Size: 108 KiB
BIN
figs/workload_chars/c3_prefill_decode_balance.png
Normal file
Binary file not shown.
Size: 89 KiB