Initial project scaffold
tasks/00_env_sanity/__init__.py
Normal file
2
tasks/00_env_sanity/__init__.py
Normal file
@@ -0,0 +1,2 @@
"""Environment sanity task."""
tasks/00_env_sanity/checklist.md (new file, 13 lines)
@@ -0,0 +1,13 @@
# Environment Checklist

- PyTorch imports successfully
- `torch.cuda.is_available()` is `True`
- At least one CUDA device is visible
- The GPU name matches the machine you expect to be using
- Device capability is printed and recorded
- Triton imports successfully, or you know why it does not
- `torch.version.cuda` is visible when using CUDA-enabled PyTorch
- `nvcc --version` works if you plan to build the CUDA extension
- `nvidia-smi` works if the driver stack is installed

If any line above fails, fix that before working on later tasks.
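
The last two checklist items can be probed from Python rather than by hand. The sketch below is a hypothetical helper (not part of this commit) that shells out to each tool and reports the first line of output, or why the tool could not be run:

```python
import subprocess


def tool_version(cmd: list[str]) -> str:
    """Return the first output line of an external tool, or a failure reason."""
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except FileNotFoundError:
        return "not found on PATH"
    except subprocess.TimeoutExpired:
        return "timed out"
    if out.returncode != 0:
        return f"exited with status {out.returncode}"
    text = (out.stdout or out.stderr).strip()
    return text.splitlines()[0] if text else "no output"


if __name__ == "__main__":
    # `nvcc` comes from the CUDA toolkit; `nvidia-smi` from the driver stack.
    print("nvcc      :", tool_version(["nvcc", "--version"]))
    print("nvidia-smi:", tool_version(["nvidia-smi"]))
```

A missing `nvcc` with a working `nvidia-smi` usually means the driver is installed but the toolkit is not, which only matters if you plan to build the CUDA extension.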
tasks/00_env_sanity/spec.md (new file, 46 lines)
@@ -0,0 +1,46 @@
# Task 00: Environment Sanity

## 1. Problem Statement

Confirm that your machine can see the GPU software stack needed for the rest of the lab.

## 2. Expected Input/Output Shapes

This task is informational rather than tensor-shaped. The outputs are environment facts:

- PyTorch version
- CUDA availability
- Triton import status
- GPU name
- device capability
- toolkit and driver hints when available
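
All of these facts are queryable from the standard PyTorch and Triton APIs. The following is a minimal sketch of what a checker like `tools/check_env.py` might do (the actual script in this repo may differ); it degrades gracefully when a component is missing:

```python
def collect_env_facts() -> dict:
    """Gather the environment facts listed above into a dict."""
    facts = {}
    try:
        import torch
        facts["torch_version"] = torch.__version__
        facts["cuda_available"] = torch.cuda.is_available()
        # None on CPU-only builds; a version string on CUDA-enabled builds.
        facts["cuda_toolkit"] = torch.version.cuda
        if torch.cuda.is_available():
            facts["gpu_name"] = torch.cuda.get_device_name(0)
            facts["device_capability"] = torch.cuda.get_device_capability(0)
    except Exception as exc:
        facts["torch_version"] = f"unavailable ({exc})"
        facts["cuda_available"] = False
    try:
        import triton
        facts["triton"] = triton.__version__
    except Exception as exc:
        facts["triton"] = f"import failed: {exc}"
    return facts


if __name__ == "__main__":
    for key, value in collect_env_facts().items():
        print(f"{key}: {value}")
```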

## 3. Performance Intuition

Do not benchmark anything yet. First confirm that the environment is what you think it is.

## 4. Memory Access Discussion

Not applicable yet. The point is to avoid debugging kernels when the real problem is a mismatched driver or toolkit.

## 5. What Triton Is Abstracting

Even importing Triton depends on a compatible Python, PyTorch, driver, and GPU stack.

## 6. What CUDA Makes Explicit

CUDA makes the toolkit and architecture targeting explicit. Keep that explicit throughout this repo.

## 7. Reflection Questions

- What exact GPU name does the system report?
- What device capability does PyTorch report?
- Does Triton import cleanly?
- Which part of the stack would you inspect first if CUDA is unavailable?

## 8. Implementation Checklist

- Run `python tools/check_env.py`
- Run `python tools/print_device_info.py`
- Write down the reported capability
- Set `KERNEL_LAB_CUDA_ARCH` explicitly if you need to change architecture targeting
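
One plausible way architecture targeting could be resolved is sketched below. The function name and the fallback order (environment variable, then detected capability, then a default) are assumptions for illustration, not the repo's actual logic; only the `KERNEL_LAB_CUDA_ARCH` variable name comes from the checklist above:

```python
import os


def resolve_cuda_arch(default: str = "80") -> str:
    """Pick the SM architecture string (e.g. "80" for sm_80) to target.

    KERNEL_LAB_CUDA_ARCH wins when set; otherwise fall back to the
    capability PyTorch detects on device 0, then to `default`.
    """
    explicit = os.environ.get("KERNEL_LAB_CUDA_ARCH")
    if explicit:
        return explicit
    try:
        import torch
        if torch.cuda.is_available():
            major, minor = torch.cuda.get_device_capability(0)
            return f"{major}{minor}"
    except Exception:
        pass
    return default
```

Setting the variable explicitly, e.g. `KERNEL_LAB_CUDA_ARCH=90` before a build, would then override whatever the machine reports, which is useful when cross-compiling for a different GPU than the one you develop on.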