Sources

Checked on 2026-06-24.

Local Repositories

| Source | Local path | Commit / HEAD | Notes |
| --- | --- | --- | --- |
| Qwen Bailian usage traces | `/home/gahow/phd/qwen-bailian-usagetraces-anon` | `5f7439c51ec248a0c585f7d90a41a6f57773b912` | Primary RS0 input is `qwen_coder_blksz_16.jsonl`. |
| Frontier | `/tmp/toc-llm-sim-research/Frontier` | `d9cfeb6d8791fbf2f295dd9744c56a666171776e` | Primary RS1 simulator candidate. |
| Vidur | `/tmp/toc-llm-sim-research/vidur` | `8383d2935bc62723a212090baa9f98ada206fc14` | Baseline simulator candidate for arrival and length replay. |
| AIConfigurator | `/tmp/toc-llm-sim-research/aiconfigurator` | `e46ece7510e727fafefb8212e5846172145a30ea` | Configuration search reference, not per-request faithful replay. |

All four local repositories were present when RS0 was generated. No external repository was cloned for RS0.

Frontier trace replay reads the CSV columns `arrived_at`, `num_prefill_tokens`, and `num_decode_tokens`.
It also parses the optional columns `session_id` and `block_hash_ids`; `block_hash_ids` can be pipe-separated (`|`), matching `examples/fixtures/prefix_cache_shared_session_trace.csv`.
Frontier's trace replay generator can clip prefill tokens when a request's total tokens exceed `trace_request_generator_config_max_tokens`. ReplayServe fixtures hard-fail before Frontier sees the trace, so the RS1 smoke test cannot silently clip.
Frontier has a built-in `Qwen/Qwen3-32B` model config.
Frontier has A800 network profiles: `data/profiling/network/a800_dgx/` and `data/profiling/network/a800_pairwise_nvlink/`.
Current public A800 compute profiles in this checkout include Llama2-7B and Qwen3 MoE / Qwen3-Next reduced variants, but no dense `Qwen/Qwen3-32B` compute profile. Until matching compute profiles or calibration data are added, RS1 Qwen3-32B A800 latency and throughput results are only a plumbing smoke test.
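
The column layout and the hard-fail behaviour described above can be illustrated with a minimal Python sketch. This is not Frontier's or ReplayServe's actual code: `TraceRow`, `read_trace`, and the `max_tokens` argument are hypothetical names chosen for illustration, and the sketch assumes `arrived_at` parses as a float and that hash IDs can be kept as opaque strings.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List, Optional

# Required and optional columns as listed in the notes above.
REQUIRED_COLUMNS = ("arrived_at", "num_prefill_tokens", "num_decode_tokens")


@dataclass
class TraceRow:
    """Hypothetical in-memory form of one trace CSV row."""
    arrived_at: float  # assumption: numeric arrival time
    num_prefill_tokens: int
    num_decode_tokens: int
    session_id: Optional[str] = None
    block_hash_ids: List[str] = field(default_factory=list)


def read_trace(text: str, max_tokens: int) -> List[TraceRow]:
    """Parse trace CSV text, raising instead of clipping over-limit rows."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows: List[TraceRow] = []
    for line_no, rec in enumerate(reader, start=2):  # line 1 is the header
        prefill = int(rec["num_prefill_tokens"])
        decode = int(rec["num_decode_tokens"])
        if prefill + decode > max_tokens:
            # Hard fail here so a downstream simulator never gets
            # the chance to clip the request silently.
            raise ValueError(
                f"line {line_no}: {prefill + decode} tokens exceeds {max_tokens}"
            )
        raw_hashes = rec.get("block_hash_ids") or ""
        rows.append(
            TraceRow(
                arrived_at=float(rec["arrived_at"]),
                num_prefill_tokens=prefill,
                num_decode_tokens=decode,
                session_id=rec.get("session_id") or None,
                # Pipe-separated IDs; empty fields yield an empty list.
                block_hash_ids=[h for h in raw_hashes.split("|") if h],
            )
        )
    return rows
```

Raising on the first over-limit row, rather than truncating it, keeps the fixture lengths identical to the released trace lengths, which is the property the smoke test relies on.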

The released JSONL rows contain `chat_id`, `parent_chat_id`, `timestamp`, `input_length`, `output_length`, `type`, `turn`, and `hash_ids`.
The trace README documents `hash_ids` as salted SipHash blocks with 16 tokens per block.
The released input lengths and hashes already reflect the model-specific chat template, so ReplayServe does not apply chat templates.
The final input block can be padded. ReplayServe records per-block token counts in the sidecar so partial final blocks can be accounted for by true token count.
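
The per-block accounting for a padded final block can be sketched as follows. This is an illustrative Python sketch, not ReplayServe's sidecar code: `block_token_counts` is a hypothetical helper, and it assumes each row carries exactly `ceil(input_length / 16)` entries in `hash_ids`. The example row is made up and contains only the two fields the calculation needs.

```python
import json
from typing import List

BLOCK_SIZE = 16  # tokens per hash block, per the trace README


def block_token_counts(
    input_length: int, num_blocks: int, block_size: int = BLOCK_SIZE
) -> List[int]:
    """Return the true token count of each block for one request.

    Every block except the last is full; the last holds whatever
    remains, so the counts always sum to input_length.
    """
    expected_blocks = -(-input_length // block_size)  # ceiling division
    if num_blocks != expected_blocks:
        # Assumption of this sketch: one hash per (possibly partial) block.
        raise ValueError(
            f"expected {expected_blocks} blocks for {input_length} tokens, "
            f"got {num_blocks}"
        )
    if num_blocks == 0:
        return []
    last = input_length - block_size * (num_blocks - 1)
    return [block_size] * (num_blocks - 1) + [last]


# Illustrative row (not taken from the released trace).
row = json.loads('{"input_length": 40, "hash_ids": [101, 102, 103]}')
counts = block_token_counts(row["input_length"], len(row["hash_ids"]))
print(counts)  # [16, 16, 8]
```

Counting by true tokens rather than by `len(hash_ids) * 16` avoids overstating the prefix length of a request whose final block is only partly filled.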