# Sources

Checked on 2026-06-24.

## Local Repositories

| Source | Local path | Commit / HEAD | Notes |
|---|---|---|---|
| Qwen Bailian usage traces | `/home/gahow/phd/qwen-bailian-usagetraces-anon` | `5f7439c51ec248a0c585f7d90a41a6f57773b912` | Primary RS0 input is `qwen_coder_blksz_16.jsonl`. |
| Frontier | `/tmp/toc-llm-sim-research/Frontier` | `d9cfeb6d8791fbf2f295dd9744c56a666171776e` | Primary RS1 simulator candidate. |
| Vidur | `/tmp/toc-llm-sim-research/vidur` | `8383d2935bc62723a212090baa9f98ada206fc14` | Baseline simulator candidate for arrival and length replay. |
| AIConfigurator | `/tmp/toc-llm-sim-research/aiconfigurator` | `e46ece7510e727fafefb8212e5846172145a30ea` | Configuration search reference, not per-request faithful replay. |

All four local repositories were present when RS0 was generated. No external repository was cloned for RS0.

## Frontier Findings

- Frontier trace replay reads CSV columns `arrived_at`, `num_prefill_tokens`, and `num_decode_tokens`.
- It also parses optional `session_id` and `block_hash_ids`; `block_hash_ids` can be `|` separated, matching `examples/fixtures/prefix_cache_shared_session_trace.csv`.
- Frontier's trace replay generator can clip prefill tokens when total tokens exceed `trace_request_generator_config_max_tokens`. ReplayServe fixtures hard fail before Frontier sees the trace, so the RS1 smoke cannot silently clip.
- Frontier has a built-in `Qwen/Qwen3-32B` model config.
- Frontier has A800 network profiles: `data/profiling/network/a800_dgx/` and `data/profiling/network/a800_pairwise_nvlink/`.
- Current public A800 compute profiles in this checkout include Llama2-7B and Qwen3 MoE / Qwen3-Next reduced variants, but no dense `Qwen/Qwen3-32B` compute profile. RS1 Qwen3-32B A800 latency and throughput results are only plumbing smoke until matching compute profiles or calibration data are added.
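
The replay columns and the hard-fail behaviour described in the Frontier findings can be illustrated with a minimal Python sketch. It is not Frontier or ReplayServe code: the function name `parse_trace_rows`, the `max_tokens` parameter, and the choices to parse `arrived_at` as a float and `block_hash_ids` as integers are assumptions made for illustration only.

```python
import csv
import io

# Column names taken from the findings above; everything else is hypothetical.
REQUIRED_COLUMNS = ("arrived_at", "num_prefill_tokens", "num_decode_tokens")


def parse_trace_rows(text, max_tokens):
    """Parse trace CSV text and raise instead of clipping over-long requests."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows = []
    # Data starts on line 2 because line 1 is the header.
    for lineno, raw in enumerate(reader, start=2):
        prefill = int(raw["num_prefill_tokens"])
        decode = int(raw["num_decode_tokens"])
        if prefill + decode > max_tokens:
            # Hard fail so a downstream simulator never gets a chance to clip.
            raise ValueError(
                f"line {lineno}: {prefill + decode} tokens exceeds limit {max_tokens}"
            )
        # Optional columns may be absent or empty.
        hashes = raw.get("block_hash_ids") or ""
        rows.append(
            {
                "arrived_at": float(raw["arrived_at"]),  # assumed numeric
                "num_prefill_tokens": prefill,
                "num_decode_tokens": decode,
                "session_id": raw.get("session_id") or None,
                # Assumed integer ids, "|" separated as in the fixture.
                "block_hash_ids": [int(h) for h in hashes.split("|") if h],
            }
        )
    return rows
```

Raising on the first over-long row, rather than truncating it, keeps the check aligned with the stated goal that the RS1 smoke cannot silently clip.
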
## Qwen Trace Findings

- The released JSONL rows contain `chat_id`, `parent_chat_id`, `timestamp`, `input_length`, `output_length`, `type`, `turn`, and `hash_ids`.
- The trace README documents `hash_ids` as salted SipHash blocks with 16 tokens per block.
- The released input lengths and hashes are already after the model-specific chat template has been applied. ReplayServe does not apply chat templates.
- The final input block can be padded. ReplayServe records per-block token counts in the sidecar so partial final blocks can be accounted for by true token count.
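
The per-block accounting for a padded final block can be sketched as follows. This is an illustration rather than the ReplayServe implementation: the helper names, the sidecar entry layout, and the assumption that each row carries exactly `ceil(input_length / 16)` hash ids are all hypothetical and should be checked against the trace README.

```python
import json

BLOCK_SIZE = 16  # tokens per hashed block, per the trace README


def block_token_counts(input_length, num_blocks, block_size=BLOCK_SIZE):
    """Return the true token count of each block; only the last may be partial."""
    expected = -(-input_length // block_size)  # ceiling division
    if num_blocks != expected:
        # Assumption: one hash id per (possibly padded) block.
        raise ValueError(
            f"expected {expected} blocks for {input_length} tokens, got {num_blocks}"
        )
    if num_blocks == 0:
        return []
    last = input_length - block_size * (num_blocks - 1)
    return [block_size] * (num_blocks - 1) + [last]


def sidecar_entry(jsonl_line):
    """Build a hypothetical sidecar record for one released JSONL row."""
    row = json.loads(jsonl_line)
    counts = block_token_counts(row["input_length"], len(row["hash_ids"]))
    return {"chat_id": row["chat_id"], "block_token_counts": counts}
```

For example, a row with `input_length` 40 and three hash ids yields counts of 16, 16, and 8, so the final block contributes 8 true tokens rather than a padded 16.
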