fix: improve LLM prompt to use actual rubric score ranges

The previous prompt had hardcoded 0-3 score examples, which misled the LLM.
The prompt now instructs the LLM to read max_score from the rubric for each criterion.
sit002 2025-12-02 14:21:41 +08:00
parent 1afc2eae48
commit c25bcfddd0


@@ -36,19 +36,25 @@ def read_file_or_string(value):
 PROMPT_TEMPLATE = """你是严格且一致的助教,按提供的评分量表为学生的简答题评分。
-- 只依据量表,不做主观延伸;允许多样表述。
-- 不输出任何解释性文本,只输出 JSON,包含:
-{{
-  "total": number(0-10, 两位小数),
-  "criteria": [
-    {{"id":"accuracy","score":0-3,"reason":"要点式一句话"}},
-    {{"id":"coverage","score":0-3,"reason":""}},
-    {{"id":"clarity","score":0-3,"reason":""}}
-  ],
-  "flags": [],
-  "confidence": number(0-1)
-}}
-如果答案与题目无关,total=0,并加 flag "need_review"。
+- 只依据量表中各评分项的 max_score 和 scoring_guide 进行评分。
+- 每个评分项的分数范围是 0 到该项的 max_score。
+- 不输出任何解释性文本,只输出 JSON。
+
+输出格式:
+{{
+  "total": number (各项分数之和,保留两位小数),
+  "criteria": [
+    {{"id": "评分项id", "score": number(0到该项max_score), "reason": "简短评语"}},
+    ...
+  ],
+  "flags": [],
+  "confidence": number(0-1, 评分置信度)
+}}
+
+重要:
+- 每个评分项的 score 必须在 0 到该项 max_score 范围内。
+- total 必须等于所有 criteria 的 score 之和。
+- 如果答案与题目无关或为空,total=0,并加 flag "need_review"。
+
 题目:
 <<<{question}>>>
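
The new prompt assumes each rubric criterion carries a max_score (and a scoring_guide) and asks the model to keep every score within that range and to make total the sum of the per-criterion scores. Below is a minimal sketch of how a caller could check the model's JSON reply against those constraints; the rubric shape, the validate_grading name, and the tolerance are assumptions for illustration, not code from this repository.

# Sketch only: validates an LLM grading reply against per-criterion max_score.
# The rubric layout and all names here are hypothetical, not from this repo.
import json


def validate_grading(rubric: dict, llm_output: str) -> dict:
    """Parse the model's JSON reply and enforce the constraints stated in the prompt."""
    result = json.loads(llm_output)
    max_by_id = {c["id"]: c["max_score"] for c in rubric["criteria"]}

    for item in result.get("criteria", []):
        max_score = max_by_id.get(item["id"])
        if max_score is None:
            raise ValueError(f"unknown criterion id: {item['id']}")
        if not 0 <= item["score"] <= max_score:
            raise ValueError(
                f"score {item['score']} for {item['id']} is outside 0..{max_score}"
            )

    # Recompute the total rather than trusting the model's arithmetic.
    expected_total = round(sum(i["score"] for i in result.get("criteria", [])), 2)
    if abs(result.get("total", 0) - expected_total) > 0.01:
        result["total"] = expected_total

    return result


if __name__ == "__main__":
    rubric = {
        "criteria": [
            {"id": "accuracy", "max_score": 5, "scoring_guide": "..."},
            {"id": "coverage", "max_score": 3, "scoring_guide": "..."},
        ]
    }
    reply = json.dumps({
        "total": 6.0,
        "criteria": [
            {"id": "accuracy", "score": 4, "reason": "ok"},
            {"id": "coverage", "score": 2, "reason": "ok"},
        ],
        "flags": [],
        "confidence": 0.8,
    })
    print(validate_grading(rubric, reply))

A check like this complements the prompt change: even with the corrected instructions, a downstream guard keeps out-of-range scores or mismatched totals from reaching stored grades.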