|
|
c25bcfddd0
|
fix: improve LLM prompt to use actual rubric score ranges
The previous prompt had hardcoded 0-3 score examples, which misled the LLM.
The prompt now instructs the LLM to read max_score from the rubric for each criterion.
|
2025-12-02 14:21:42 +08:00 |
|
|
|
38b6d5498d
|
fix: calculate total from criteria scores instead of trusting LLM
The LLM sometimes returns inconsistent total values. We now always
compute total = sum(criteria.score) for accuracy.
|
2025-12-02 14:19:16 +08:00 |
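
A minimal sketch of the fix described in 38b6d5498d, assuming the LLM returns a dict with a "criteria" list whose items each carry a numeric "score". The function name recompute_total and the exact result shape are hypothetical; the commit message only states that total = sum(criteria.score).

```python
from typing import Any


def recompute_total(result: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the grading result with "total" recomputed.

    Assumed (hypothetical) shape: {"criteria": [{"score": <number>, ...}], "total": <number>}.
    Whatever total the model reported is ignored and replaced by the sum.
    """
    criteria = result.get("criteria", [])
    total = sum(item["score"] for item in criteria)
    # Build a new dict so the caller's original object is left untouched.
    return {**result, "total": total}


# Example: the model reported a total that does not match its own scores.
llm_output = {
    "criteria": [
        {"name": "clarity", "score": 2},
        {"name": "correctness", "score": 4},
    ],
    "total": 9,  # inconsistent value from the model
}
fixed = recompute_total(llm_output)
print(fixed["total"])
```

Recomputing unconditionally, rather than only when a mismatch is detected, keeps the behavior simple: the per-criterion scores are the single source of truth.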
|
|
|
9de97bddf1
|
fix: allow --question to be a string instead of file path
|
2025-12-01 23:46:38 +08:00 |
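
One possible way to accept either a literal string or a file path for --question is sketched below. Only the --question flag comes from the commit message; the helper resolve_question, the parser setup, and the "read the file if it exists, otherwise use the text as-is" rule are assumptions for illustration, not the repository's actual code.

```python
import argparse
from pathlib import Path


def resolve_question(value: str) -> str:
    """Return file contents if value names an existing file, else value itself."""
    try:
        path = Path(value)
        if path.is_file():
            return path.read_text(encoding="utf-8")
    except (OSError, ValueError):
        # Long or unusual question text may not be a valid path at all;
        # in that case fall through and treat it as a literal string.
        pass
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--question",
        required=True,
        help="question text, or a path to a file containing it",
    )
    return parser


# Passing the question inline instead of as a file path.
args = build_parser().parse_args(["--question", "What is 2 + 2?"])
question = resolve_question(args.question)
print(question)
```

A trade-off of this rule is ambiguity: a question that happens to match an existing file name would be read as a file.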
|
|
|
d0a4992cd6
|
Initial commit
|
2025-12-01 22:12:02 +08:00 |
|