3dc343684d
fix: enforce integer scores matching rubric scoring_guide
- Prompt now explicitly requires integer scores (0/1/2/3/4)
- Code rounds any decimal score to the nearest integer
- Prevents the LLM from returning fractional scores such as 2.5 or 3.5
2025-12-02 14:36:33 +08:00
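
A minimal sketch of the rounding step described in this commit, not the repository's actual code. The function name `normalize_score`, the half-up tie-breaking rule, and the clamp to `[0, max_score]` are all assumptions; the commit only states that decimal scores are rounded to the nearest integer.

```python
import math


def normalize_score(raw_score: float, max_score: int) -> int:
    """Round an LLM-provided score to an integer within the rubric range.

    Hypothetical helper: the tie-breaking rule (halves round up) and the
    clamping to [0, max_score] are assumptions, not taken from the commit.
    """
    # Python's built-in round() rounds halves to the nearest even number
    # (round(2.5) == 2), so use floor(x + 0.5) for round-half-up instead.
    rounded = math.floor(raw_score + 0.5)
    # Keep the result inside the range allowed for this criterion.
    return max(0, min(max_score, rounded))
```

With this sketch, `normalize_score(2.5, 4)` gives 3 and `normalize_score(3.5, 4)` gives 4. Half-up is chosen here only so that every `x.5` score moves in the same direction; the real code may break ties differently.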
c25bcfddd0
fix: improve LLM prompt to use actual rubric score ranges
The previous prompt hardcoded 0-3 score examples, which misled the LLM.
The prompt now instructs the LLM to read max_score from the rubric for each criterion.
2025-12-02 14:21:42 +08:00
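
One way to build per-criterion score ranges from the rubric instead of hardcoding them might look like the sketch below. The rubric shape (a list of dicts with `name` and `max_score` keys), the function name `build_score_instructions`, and the prompt wording are assumptions for illustration; only the `max_score` field is named in the commit.

```python
def build_score_instructions(rubric: list[dict]) -> str:
    """Render one prompt line per criterion using its own max_score.

    Hypothetical helper: the rubric layout and the wording are assumed.
    """
    lines = ["Score each criterion within its own range:"]
    for criterion in rubric:
        # The upper bound comes from the rubric, not from a fixed example.
        lines.append(f"- {criterion['name']}: 0 to {criterion['max_score']}")
    return "\n".join(lines)
```

Because the range is read per criterion, a rubric that mixes 0-2 and 0-4 criteria produces a prompt that states both ranges correctly.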
38b6d5498d
fix: calculate total from criteria scores instead of trusting LLM
The LLM sometimes returns a total that does not match the per-criterion
scores. We now always compute total = sum(criteria.score) ourselves.
2025-12-02 14:19:16 +08:00
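
The recomputation described in this commit can be sketched as follows. The response shape (a dict with a `criteria` list whose items carry a `score`, plus a `total` field) and the function name `recompute_total` are assumptions inferred from the `sum(criteria.score)` expression in the message.

```python
def recompute_total(result: dict) -> dict:
    """Return a copy of the LLM result with total derived from the criteria.

    Hypothetical helper: the "criteria", "score", and "total" keys are an
    assumed response layout.
    """
    criteria = result.get("criteria", [])
    # Ignore whatever total the LLM reported and sum the scores ourselves.
    total = sum(criterion["score"] for criterion in criteria)
    return {**result, "total": total}
```

For example, a response with scores 2 and 3 and a reported total of 99 comes back with a total of 5.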
9de97bddf1
fix: allow --question to be a string instead of file path
2025-12-01 23:46:38 +08:00
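
A minimal sketch of how a `--question` value could be accepted either as literal text or as a file path. The function name `load_question` and the rule "read the file if the value names an existing file, otherwise use the value as-is" are assumptions; the commit does not say how the two cases are told apart.

```python
from pathlib import Path


def load_question(value: str) -> str:
    """Return the question text for a --question argument.

    Hypothetical helper: treats the value as a path only when it points
    to an existing file, and falls back to the raw string otherwise.
    """
    try:
        path = Path(value)
        if path.is_file():
            return path.read_text(encoding="utf-8")
    except (OSError, ValueError):
        # Very long or otherwise invalid path strings are plain questions.
        pass
    return value
```

A value such as `"What is 2 + 2?"` does not name an existing file, so it is returned unchanged.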
d0a4992cd6
Initial commit
2025-12-01 22:12:02 +08:00