Initial commit: obsidian to gitea

2026-05-07 15:04:41 +08:00
commit a57afa86b4
323 changed files with 42569 additions and 0 deletions
--- a/phd/weekly-report/24/241222.md
+++ b/phd/weekly-report/24/241222.md
@@ -0,0 +1,16 @@
+Objective
+- Serverless KVCache caching
+
+Key Results
+- Implement the workload-aware policy in vLLM [8/10]
+- Profile the workload-aware policy [3/10]
+- Add the workload differences observed in the Qwen trace
+
+Last Week
+- Add a new design point to the cache policy so that it considers cache memory size and predicted reuse distance together. To support this, add a new monitor that tracks each workload's reuse time interval and average number of tokens.
+- Set up an offline (i.e., best-case) scheduling policy, then profile the default policy, our workload-aware policy, and the offline policy to show the performance difference in the CDF of TTFT.
+- Implement a cache block source tracker in vLLM to show where KVCache reuse comes from. The tracker shows that 90% of KVCache reuse comes from multi-turn chat.
+
+Next Week
+- Improve the performance of our policy.
+- Plot polished figures for the formal write-up.