data: include qwen35-swebench-50sess trace under third_party/traces/

Add the 54 MB SWE 50sess replay trace to the repo under
third_party/traces/ so it travels with `git clone` to GPU nodes that
can't reach the sandbox network. Previously the trace lived only under
outputs/, which is gitignored.

Whitelist third_party/traces/ in .gitignore (same pattern as the
existing third_party/sglang/ allowlist).

After cloning on a new host, either symlink the file into outputs/ for
backward compatibility (create outputs/ first, since the gitignored
directory may not exist in a fresh clone):
  mkdir -p outputs
  ln -sf ../third_party/traces/qwen35-swebench-50sess.jsonl \
         outputs/qwen35-swebench-50sess.jsonl
or update sweep scripts to point --trace at third_party/traces/.

The README in the new directory documents the file's lineage
(SiCo → SiBench → audit.jsonl → convert_audit_to_trace.py) and warns
about GitLab's 100 MB single-file limit for future trace additions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kzlin
2026-05-13 14:04:54 +08:00
parent 314c4cda0e
commit 8fc31be605
3 changed files with 4485 additions and 0 deletions

.gitignore vendored

@@ -13,6 +13,10 @@ src/*.egg-info
outputs/
# Vendored dependencies. Track only the maintained SGLang fork/snapshot.
# third_party/traces/ holds the replay trace files used by the benchmark
# (~56 MB each) for convenient transfer between hosts; they would otherwise
# live under outputs/ but outputs/ is gitignored.
third_party/*
!third_party/sglang/
!third_party/traces/
*.log

third_party/traces/README.md vendored Normal file

@@ -0,0 +1,32 @@
# Replay traces
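To make cross-host transfer easier, the trace files used by the benchmark are kept here. This directory is
explicitly whitelisted in `.gitignore` (same as `third_party/sglang/`), so the files travel with git.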
## Files

| File | Size | Contents | Source |
|---|---:|---|---|
| `qwen35-swebench-50sess.jsonl` | 54 MB | 4449 reqs / 52 sessions / Qwen3.5-35B inference output | In the `simm-swe-bench` project, SiBench replays SiCo `swe.jsonl` through SGLang to produce audit.jsonl, which `scripts/convert_audit_to_trace.py` then converts |

For the detailed lineage see `docs/ONBOARDING_NEXT_AGENT_ZH.md`; for the actual schema see `src/agentic_pd_hybrid/trace.py`.
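## Usage

The replay side takes the trace path from the CLI flag `--trace`. The default sweep scripts point to
`outputs/qwen35-swebench-50sess.jsonl`; to stay backward compatible with old scripts, **symlinking the file
there after cloning is recommended**: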
```bash
mkdir -p outputs
ln -sf ../third_party/traces/qwen35-swebench-50sess.jsonl \
outputs/qwen35-swebench-50sess.jsonl
```
Alternatively, change the `--trace` path in the sweep scripts to point at `third_party/traces/...`.
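
A relative symlink target such as `../third_party/traces/...` is resolved relative to the directory that holds the link, so a mistyped path produces a dangling link that only fails once a sweep starts. The sketch below is one way to check a trace path before running anything; `check_trace` is a hypothetical helper name, not something shipped in this repo.

```bash
# Hypothetical helper (not part of the repo): report whether a trace path
# exists, distinguishing a dangling symlink from a missing file.
check_trace() {
  local path="$1"
  if [ -L "$path" ] && [ ! -e "$path" ]; then
    # -L is true for the link itself; -e follows the link to its target.
    echo "dangling symlink: $path"
    return 1
  elif [ ! -e "$path" ]; then
    echo "missing: $path"
    return 1
  fi
  echo "ok: $path"
}
```

For example, `check_trace outputs/qwen35-swebench-50sess.jsonl` can be run from the repo root right after creating the link.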
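## Adding a new trace

To add a new trace file later (for example, a converted version of `codex_swebenchpro`), put it in this directory
and update the file list in this README. **Do not `git add` a single file larger than 100 MB directly**: by default
GitLab limits single files to 100 MB when LFS is not enabled.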
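
To catch an oversized trace before staging it, a check along these lines may help. It is a minimal sketch: `find_oversized` is a hypothetical helper name, and `find` counts `M` in units of 1048576 bytes, so the threshold only approximates the 100 MB figure above.

```bash
# Hypothetical helper (not part of the repo): print files under a directory
# whose size exceeds a limit given in find's "M" units (1048576 bytes each).
find_oversized() {
  local dir="$1"
  local limit_mb="${2:-100}"   # defaults to the 100 MB limit noted above
  find "$dir" -type f -size +"${limit_mb}M" -print
}
```

Running `find_oversized third_party/traces` before `git add` prints nothing when every file is under the limit.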

File diff suppressed because one or more lines are too long