- Workflow now fetches grading scripts from the private tests repo at runtime
- This prevents students from modifying the grading scripts to cheat
- Prompt now explicitly requires integer scores (0, 1, 2, 3, or 4)
- Code rounds any decimal score to the nearest integer
- Prevents the LLM from returning fractional scores such as 2.5 or 3.5
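
The rounding step described above could look roughly like the sketch below. It is not the actual grading code: the function name `normalize_score`, the half-up tie rule, and the clamping to the 0-4 range are all assumptions made for illustration. Half-up rounding is used here because Python's built-in `round()` rounds ties to the nearest even integer, so `round(2.5)` gives 2.

```python
import math

# Assumed bounds, taken from the 0-4 scale named in the prompt.
MIN_SCORE = 0
MAX_SCORE = 4


def normalize_score(raw):
    """Convert a score returned by the LLM into an integer from 0 to 4.

    Hypothetical helper: rounds to the nearest integer with ties going up,
    then clamps the result to the allowed range (clamping is an assumption).
    """
    # floor(x + 0.5) rounds half up; math.floor returns an int.
    rounded = math.floor(float(raw) + 0.5)
    return max(MIN_SCORE, min(MAX_SCORE, rounded))
```

With this tie rule, a score of 2.5 becomes 3 and a score of 3.5 becomes 4. If ties should instead round to even, the built-in `round()` would be the simpler choice.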